Recombination system

ABSTRACT

The present invention relates to a method for carrying out recombination at a target locus.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a §371 National Stage Application of PCT/EP2013/055048, filed Mar. 12, 2013, which claims priority to EP 12159098.8, filed Mar. 12, 2012.

BACKGROUND

Field of the Invention

The present invention relates to a method for carrying out recombination at a target locus. The invention also relates to a cell prepared according to a method of the invention which is carried out in vivo.

Description of Related Art

Different cell types may be used for different industrial purposes. For example: mammalian cell lines are used for antibody production; fungal cells are preferred organisms for production of polypeptides and secondary metabolites; bacterial cells are preferred for small metabolite and antibiotic production; and plant cells are preferred for taste and flavor compounds.

Recombinant techniques are widely employed for optimization of the productivity of such cells and/or the processes in which they are used. This can involve a multitude of options, including, but not limited to, overexpression of a gene of interest, deletion or inactivation of competing pathways, changing compartmentalization of enzymes, increasing protein or metabolite secretion, increasing organelle content and the like.

In the case of filamentous fungi, the limited availability of selectable markers complicates the construction of new cell lines. Typically, target sequences are altered in vitro to create mutant alleles with inserted antibiotic resistance markers. However, regulatory authorities in most countries object to the use of antibiotic resistance markers in view of the potential risks of spreading resistance genes to the biosphere from large-scale use of production strains carrying such genes. In addition, there is a limited number of selectable markers suitable for use in filamentous fungi.

Accordingly, selectable marker genes may need to be removed so that production strains may be used commercially and/or so that the same marker gene may be recycled for use in sequential strain modification.

SUMMARY

The invention concerns a method for carrying out recombination at a target locus, or target loci, for example within a target genome. The recombination method of the invention results in alteration of the target locus, for example the insertion of nucleic acid sequence at the target locus. The method may be carried out such that insertion of new sequence at the target locus is accompanied by removal of existing sequence from the target locus. That is to say, the method may be used to substitute a sequence at the target locus with an alternative sequence. The method may conveniently be carried out in vivo in a host cell.

Typically, when carried out in vivo, the method of the invention is not carried out on a human or animal cell. That is to say, the method of the invention is not typically carried out in the form of a method of treatment. The method of the invention may be carried out in an ex vivo or in vitro format. The terms ex vivo or in vitro should be understood to encompass methods carried out on microorganisms (both on whole living cells or on non-cellular material), but to exclude methods carried out on humans or animals.

The method is typically carried out such that at least part of the sequence inserted at the target locus is subsequently removed. If the method is carried out such that substitution of a sequence occurs at the target locus, followed by removal of the inserted sequence, the result may be deletion of sequence from the target locus.

Accordingly, the method of the invention may be carried out to achieve alteration of the sequence of the target locus. Such alteration may be, for example, addition of new sequence, replacement of existing sequence and/or deletion/removal of existing sequence.

Typically, the invention is carried out in vivo in a host cell. The host cell may, preferably, be one which produces a compound of interest. The host cell may be capable of producing the compound of interest prior to application of the method of the invention. In this case, the method of the invention may be used to modify the target locus so that production of the compound of interest by the host cell is altered, for example production may be increased. Alternatively, the host cell may be one which produces the compound of interest as a result of application of the method of the invention.

According to the invention, there is thus provided a method for carrying out recombination at a target locus, which method comprises:

-   providing two or more nucleic acids which, when taken together, comprise: (a) sequences capable of homologous recombination with sequences flanking the target locus; (b) two or more site-specific recombination sites; (c) a sequence encoding a recombinase which recognizes the site-specific recombination sites; and (d) a sequence encoding a marker,
    -   wherein the two or more nucleic acids are capable of homologous recombination with each other so as to give rise to a single nucleic acid, and
    -   wherein at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of a marker gene; and
-   recombining the said two or more nucleic acids with each other and with the sequences flanking the target locus so that a contiguous nucleic acid sequence encoding a functional marker and/or essential gene and the sequence encoding the recombinase are inserted at the target locus, said marker-encoding and/or recombinase-encoding sequence being flanked by at least two site-specific recombination sites and the said site-specific recombination sites being flanked by the sequences capable of homologous recombination with sequences flanking the target locus.

Thus, the at least two of the two or more nucleic acids each comprising a sequence encoding a non-functional portion of a marker gene each comprise a partial sequence which, after recombination, encodes a functional marker (and wherein the parts by themselves do not encode a functional marker).

The invention also relates to a cell produced by a process according to the invention which is carried out in vivo.

The method of the invention is advantageous in comparison with current methods in that it allows the continuous use of non-counterselectable markers in strain transformations. This is advantageous, in particular in filamentous fungi, where a limited number of counterselectable markers are known and recycling of markers for repetitive use is of utmost importance.

In addition, this system allows the use of vectors which do not comprise a full knock-out cassette in a single polynucleotide. This avoids cloning problems and yields a more stable nucleotide fragment (a site-specific recombinase cannot act on its recombination sites in yeast or E. coli, for example, since the sites are not both present).

Also, the method of the invention is carried out using nucleic acid fragments which can be generated via amplification using automated methods. This leads to a more flexible system with higher throughput, since fragment amplification (for example by PCR) is much easier to automate than restriction digestion.

Using the method of the invention, extremely efficient strain construction may be achieved (near 100% efficiency) and the method may be used to generate multiple knock-outs more quickly than using existing techniques since multiple markers may be introduced and/or removed in one step.

Using the method of the invention, strain construction and modifications may be traced and more easily characterized, since a site-specific recombination site may remain at the target locus or loci.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 sets out a schematic diagram of plasmid pDELNicB-3, which is the basis for a replacement cassette to inactivate the nicB gene in A. niger. The replacement cassette comprises the nicB flanking regions, the hygB marker cassette, mutant loxP sites and E. coli DNA. More details for pDELNicB-3 can be found in the Examples section (vide infra).

FIG. 2 sets out a schematic diagram of plasmid pDEL_PdxA-2, which is the basis for a replacement cassette to inactivate the pdxA gene in A. niger. The replacement cassette comprises the pdxA flanking regions, the ble marker cassette, mutant loxP sites and E. coli DNA. More details for pDEL_PdxA-2 can be found in the Examples section (vide infra).

FIG. 3 sets out a schematic diagram of plasmid pDEL_EPO_Hyg-1, which comprises part of a replacement cassette to inactivate the epo gene in A. niger. The replacement cassette comprises an epo flanking region, part of a hygB marker cassette, a mutant loxP site and E. coli DNA. More details for pDEL_EPO_Hyg-1 can be found in the Examples section (vide infra).

FIG. 4 sets out a schematic diagram of plasmid pDEL_EPO_CRE-1, which comprises part of a replacement cassette to inactivate the epo gene in A. niger. The replacement cassette comprises an epo flanking region, part of a hygB marker cassette, a mutant loxP site, a cre recombinase expression cassette and E. coli DNA. More details for pDEL_EPO_CRE-1 can be found in the Examples section (vide infra).

FIG. 5 sets out a schematic representation for fragment generation and use of these fragments in transformation and recombination in A. niger. The top part shows the generation of the “bipartite left” and “bipartite right” fragments as amplified by PCR. The lower panels show A. niger transformation through homologous recombination of the bipartite left and right fragments within the genome.

FIG. 6 sets out a schematic representation for fragment generation and use of these fragments in transformation and recombination in A. niger as also shown in FIG. 5. The respective bipartite fragments in this specific example differ since they additionally comprise a Cre recombinase cassette. The last panel shows the resulting layout of the genomic locus after a Cre induced recombination event.

FIG. 7 sets out Cre induced loss of loxP flanked hygB selection marker. The upper plates are Cre non-induced transformants. Lower plates are Cre induced by plating on xylose as carbon source. The percentages show the percentage of marker removal in A. niger colonies after Cre induction.

FIG. 8 depicts a map of pEBA513 for transient expression of cre recombinase in fungi. pEBA513 is a pAMPF21 derived vector containing the AMA1 region and the CAT chloramphenicol resistance gene. Depicted are the cre recombinase gene (cre) expression cassette, containing the A. niger glaA promoter (Pgla), cre recombinase coding region, and niaD terminator. In addition, the hygromycin resistance cassette consisting of the A. nidulans gpdA promoter (PgpdA), hygB coding region and the P. chrysogenum penDE terminator is indicated.

FIG. 9 shows the detection of pGBTOPEBA-205 expression plasmid in the R. emersonii genome by PCR. Genomic DNA was isolated and analysed by PCR from transformant A-A4 (lanes 2-4) and the empty strain (lanes 5-7). Plasmid DNA was used as control template for the PCR reactions: pGBTOPEBA-4 (lane 8), pGBTOPEBA-8 (lane 9) and pGBTOPEBA-205 (lane 10). In the PCR reactions primers were added directed against pGBTOPEBA-4 (lanes 2, 5, and 8, expected fragment: 256 bp), pGBTOPEBA-8 (lanes 3, 6, and 9, expected fragment: 306 bp), and pGBTOPEBA-205 (lanes 4, 7, and 10, expected fragment: 452 bp). Lanes 1 and 11 contain a molecular weight marker.

FIG. 10 shows the phenotypic and PCR analysis of marker-free R. emersonii transformants. Transformant A-A4, containing multiple R. emersonii Cbhl copies and the pDEL_Pdx-A2 plasmid carrying loxP flanked ble expression cassette, was transformed with milliQ water (control) or the pEBA513 construct for transient expression of cre-recombinase.

(FIG. 10A): Pictures of MTP plates of transformants and the empty strain grown on PDA medium with 10 μg/ml phleomycin (left panel) and PDA without selection (right panel). Row A shows two milliQ control transformants of A-A4 that contain the pDEL_Pdx-A2 with the loxP flanked ble expression cassette (lox-ble-lox). Growth of two pEBA513 transformed A-A4 transformants (lox-ble-lox+pEBA513) is shown in row B. The parental transformant A-A4 (lox-ble-lox, before transformation) was grown in row C. Finally, growth of the empty strain is shown in row D.

(FIG. 10B): PCR analysis of transformants before and after marker removal by cre-recombinase and of the cre-recombinase construct. Lanes 2 and 3 show PCR fragments obtained by PCR analysis of two milliQ control A-A4 transformants using primers directed against the pdx flanks of the ble expression construct. The 2752 bp PCR band is the expected amplified PCR fragment if the transformant still contains the ble selection marker. Lanes 5 and 6 show PCR analysis of two A-A4 transformants that were transformed with pEBA513 using primers directed against the hygB gene of the pEBA513 cre recombinase expression plasmid (314 bp fragment). Lanes 8 and 9 show PCR fragments of two A-A4 transformants that were transformed with pEBA513 using primers directed against the pdx flanks of the ble expression construct. The 881 bp PCR fragment is indicative for the deletion of the ble expression cassette from the R. emersonii transformant. Lanes 1, 4 and 7 contain a molecular weight marker.

FIG. 11 depicts the pEBA1001 vector. Part of the vector fragment was used in a bipartite gene-targeting method in combination with the pEBA1002 vector with the goal to delete the ReKu80 ORF in Rasamsonia emersonii. The vector comprises a 2500 bp 5′ upstream flanking region, a lox66 site, the 5′ part of the ble coding sequence driven by the A. nidulans gpdA promoter and the backbone of pUC19 (Invitrogen, Breda, The Netherlands). The E. coli DNA was removed by digestion with restriction enzyme NotI, prior to transformation of the R. emersonii strains.

FIG. 12 depicts the pEBA1002 vector. Part of the vector fragment was used in a bipartite gene-targeting method in combination with the pEBA1001 vector with the goal to delete the ReKu80 ORF in Rasamsonia emersonii. The vector comprises the 3′ part of the ble coding region, the A. nidulans trpC terminator, a lox71 site, a 2500 bp 3′ downstream flanking region of the ReKu80 ORF, and the backbone of pUC19 (Invitrogen, Breda, The Netherlands). The E. coli DNA was removed by digestion with restriction enzyme NotI, prior to transformation of the R. emersonii strains.

FIG. 13 depicts the strategy used to delete the ReKu80 gene of R. emersonii. The vectors for deletion of ReKu80 comprise the overlapping non-functional ble selection marker fragments (split marker) flanked by loxP sites and 5′ and 3′ homologous regions of the ReKu80 gene for targeting (1). The constructs integrate through triple homologous recombination (X) at the genomic ReKu80 locus and at the overlapping homologous non-functional ble selection marker fragment (2) and replace the genomic ReKu80 gene copy (3). Subsequently, the selection marker is removed by transient expression of cre recombinase, leading to recombination between the lox66 and lox71 sites and resulting in the deletion of the ble gene, with a remaining double-mutant lox72 site left within the genome (4). Using this overall strategy, the ReKu80 ORF is removed from the genome.

FIG. 14 shows the Southern blot analysis of ReKu80 knock out strains. Genomic DNA was isolated from strains and digested with restriction enzyme HindIII. The Southern blot was hybridized using a probe directed against the 3′ region of the ReKu80 gene. Lane 1: wild-type strain; lanes 2 and 3: two phleomycin resistant strains showing fragment size for correct ReKu80 knock out integration; lane 4: labeled molecular weight marker; lanes 5 and 6: two phleomycin sensitive strains with fragment size for correct ReKu80 knock out integration.

FIG. 15 depicts the pEBA1005 vector that was used in a bipartite gene-targeting method in combination with the pEBA1006 vector with the goal to delete the RePepA ORF in Rasamsonia emersonii. The vector comprises a 2500 bp 5′ flanking region, a lox66 site, the 5′ part of the ble coding region driven by the A. nidulans gpdA promoter and the backbone of pUC19 (Invitrogen, Breda, The Netherlands).

FIG. 16 depicts the pEBA1006 vector that was used in a bipartite gene-targeting method in combination with the pEBA1005 vector with the goal to delete the RePepA ORF in Rasamsonia emersonii. The vector comprises the 3′ part of the ble coding region, the A. nidulans trpC terminator, a lox71 site, a 2500 bp 3′ flanking region of the ReKu80 ORF, and the backbone of pUC19 (Invitrogen, Breda, The Netherlands).

FIG. 17 depicts the pEBA10056 vector that was used to delete the RePepA ORF in Rasamsonia emersonii. The vector comprises a 2500 bp 5′ flanking region, a lox66 site, the ble expression cassette consisting of the A. nidulans gpdA promoter, ble coding region and A. nidulans trpC terminator, a lox71 site, a 2500 bp 3′ flanking region of the ReKu80 ORF, and the backbone of pUC19 (Invitrogen, Breda, The Netherlands).

FIG. 18 shows a picture of PDA plates supplemented with 1% Casein sodium salt with TEC-142S and the deltaReKu80-2 strains transformed with RePepA deletion constructs containing 2.5 kb flanks.

FIG. 19 sets out a schematic diagram of plasmid pPepAHyg, which comprises part of a replacement cassette to inactivate the RePepA gene in R. emersonii. The replacement cassette comprises a 1500 nucleotide RePepA 5′ flanking region, part of a hygB marker cassette, a mutant loxP site and E. coli DNA. More details for pPepAHyg can be found in the Examples section (vide infra).

FIG. 20 sets out a schematic diagram of plasmid pPepACre, which comprises part of a replacement cassette to inactivate the RePepA gene in R. emersonii. The replacement cassette comprises a RePepA 3′ flanking region, part of a hygB marker cassette, a mutant loxP site, a cre recombinase expression cassette and E. coli DNA. More details for pPepACre can be found in the Examples section (vide infra).

FIG. 21 sets out a schematic representation for fragment use in transformation and recombination in R. emersonii. The vectors for deletion of RePepA comprise the overlapping non-functional hygB selection marker fragments (split marker) flanked by loxP sites and 5′ and 3′ homologous regions of the RePepA gene for targeting (1). The constructs integrate through triple homologous recombination (X) at the genomic RePepA locus and at the overlapping homologous non-functional hygB selection marker fragment (2) and replace the genomic RePepA gene copy (3). Subsequently, the selection marker is removed by growing transformants on xylose to induce cre recombinase expression, leading to recombination between the lox66 and lox71 sites and resulting in the deletion of the hygB and Cre expression cassettes, with a remaining double-mutant lox72 site left within the genome (4).

FIG. 22 sets out Cre induced loss of loxP flanked hygB selection marker in Rasamsonia emersonii. Transformants carrying the loxP flanked hygB selection marker and cre recombinase expression cassette integrated into the RePepA locus were plated on xylose as carbon source to induce cre recombinase. After cre induction colonies were transferred to PDA without selection (left) and PDA with hygromycinB selection (right). An empty strain was included as a control for selection.

FIG. 23 sets out a schematic procedure for knocking out the ADE1 gene in S. cerevisiae using a bipartite marker and cre recombinase with an inducible promoter, all flanked by lox71 and lox66.

BRIEF DESCRIPTION OF SEQUENCE LISTING

SEQ ID NO: 1 sets out the mutant loxP site, lox66.

SEQ ID NO: 2 sets out the mutant loxP site, lox71.

SEQ ID NO: 3 sets out the double-mutant lox72 site.

SEQ ID NO: 4 sets out a first non-functional hygB marker fragment (PgpdA-HygB sequence missing the last 27 bases of the coding sequence at the 3′ end of hygB).

SEQ ID NO: 5 sets out a second non-functional hygB fragment (HygB-TtrpC sequence missing the first 44 bases of the coding sequence at the 5′ end of hygB).

SEQ ID NO: 6 sets out the cre recombinase cassette containing the A. nidulans xylanase A promoter, a cre recombinase and xylanase A terminator, to allow xylose-inducible expression of the cre recombinase.

SEQ ID NO: 7 sets out the DNA sequence of the Ble-forward PCR primer.

SEQ ID NO: 8 sets out the DNA sequence of the Ble-reverse PCR primer.

SEQ ID NO: 9 sets out the DNA sequence of the EBA205-forward PCR primer.

SEQ ID NO: 10 sets out the DNA sequence of the EBA205-reverse PCR primer.

SEQ ID NO: 11 sets out the DNA sequence of the pGBTOPEBA4-forward PCR primer.

SEQ ID NO: 12 sets out the DNA sequence of the pGBTOPEBA4-reverse PCR primer.

SEQ ID NO: 13 sets out the DNA sequence of the pGBTOPEBA8-forward PCR primer.

SEQ ID NO: 14 sets out the DNA sequence of the pGBTOPEBA8-reverse PCR primer.

SEQ ID NO: 15 sets out the DNA sequence of the Pdx-forward PCR primer.

SEQ ID NO: 16 sets out the DNA sequence of the Pdx-reverse PCR primer.

SEQ ID NO: 17 sets out the DNA sequence of the Hyg-forward PCR primer.

SEQ ID NO: 18 sets out the DNA sequence of the Hyg-reverse PCR primer.

SEQ ID NO: 19 sets out the nucleic acid sequence of the ReKu70 genomic region including flanking sequence.

SEQ ID NO: 20 sets out the nucleic acid sequence of the ReKu70 cDNA.

SEQ ID NO: 21 sets out the amino acid sequence of the ReKu70 polypeptide.

SEQ ID NO: 22 sets out the nucleic acid sequence of the ReKu80 genomic region including flanking sequence.

SEQ ID NO: 23 sets out the nucleic acid sequence of the ReKu80 cDNA.

SEQ ID NO: 24 sets out the amino acid sequence of the ReKu80 polypeptide.

SEQ ID NO: 25 sets out the nucleic acid sequence of the ReRad50 genomic region including flanking sequence.

SEQ ID NO: 26 sets out the nucleic acid sequence of the ReRad50 cDNA.

SEQ ID NO: 27 sets out the amino acid sequence of the ReRad50 polypeptide.

SEQ ID NO: 28 sets out the nucleic acid sequence of the ReRad51 genomic region including flanking sequence.

SEQ ID NO: 29 sets out the nucleic acid sequence of the ReRad51 cDNA.

SEQ ID NO: 30 sets out the amino acid sequence of the ReRad51 polypeptide.

SEQ ID NO: 31 sets out the nucleic acid sequence of the ReRad52 genomic region including flanking sequence.

SEQ ID NO: 32 sets out the nucleic acid sequence of the ReRad52 cDNA.

SEQ ID NO: 33 sets out the amino acid sequence of the ReRad52 polypeptide.

SEQ ID NO: 34 sets out the nucleic acid sequence of the ReRad54a genomic region including flanking sequence.

SEQ ID NO: 35 sets out the nucleic acid sequence of the ReRad54a cDNA.

SEQ ID NO: 36 sets out the amino acid sequence of the ReRad54a polypeptide.

SEQ ID NO: 37 sets out the nucleic acid sequence of the ReRad54b genomic region including flanking sequence.

SEQ ID NO: 38 sets out the nucleic acid sequence of the ReRad54b cDNA.

SEQ ID NO: 39 sets out the amino acid sequence of the ReRad54b polypeptide.

SEQ ID NO: 40 sets out the nucleic acid sequence of the ReRad55 genomic region including flanking sequence.

SEQ ID NO: 41 sets out the nucleic acid sequence of the ReRad55 cDNA.

SEQ ID NO: 42 sets out the amino acid sequence of the ReRad55 polypeptide.

SEQ ID NO: 43 sets out the nucleic acid sequence of the ReRad57 genomic region including flanking sequence.

SEQ ID NO: 44 sets out the nucleic acid sequence of the ReRad57 cDNA.

SEQ ID NO: 45 sets out the amino acid sequence of the ReRad57 polypeptide.

SEQ ID NO: 46 sets out the nucleic acid sequence of the ReCDC2 genomic region including flanking sequence.

SEQ ID NO: 47 sets out the nucleic acid sequence of the ReCDC2 cDNA.

SEQ ID NO: 48 sets out the amino acid sequence of the ReCDC2 polypeptide.

SEQ ID NO: 49 sets out the nucleic acid sequence of the ReLIG4 genomic region including flanking sequence.

SEQ ID NO: 50 sets out the nucleic acid sequence of the ReLIG4 cDNA.

SEQ ID NO: 51 sets out the amino acid sequence of the ReLIG4 polypeptide.

SEQ ID NO: 52 sets out the nucleic acid sequence of the ReMRE11 genomic region including flanking sequence.

SEQ ID NO: 53 sets out the nucleic acid sequence of the ReMRE11 cDNA.

SEQ ID NO: 54 sets out the amino acid sequence of the ReMRE11 polypeptide.

SEQ ID NO: 55 sets out the DNA sequence of the Ku80-forward PCR primer.

SEQ ID NO: 56 sets out the DNA sequence of the Ku80-reverse PCR primer.

SEQ ID NO: 57 sets out the nucleic acid sequence of the Rasamsonia emersonii pepA genomic region including flanking sequence.

SEQ ID NO: 58 sets out the nucleic acid sequence of the Rasamsonia emersonii pepA cDNA.

SEQ ID NO: 59 sets out the amino acid sequence of the Rasamsonia emersonii pepA polypeptide.

SEQ ID NO: 60 sets out a first non-functional ble marker fragment (PgpdA-ble sequence missing the last 104 bases of the coding sequence at the 3′ end of ble).

SEQ ID NO: 61 sets out a second non-functional ble fragment (ble-TtrpC sequence missing the first 12 bases of the coding sequence at the 5′ end of ble).

SEQ ID NO: 62 sets out the sequence of basic construct 1

SEQ ID NO: 63 sets out the sequence of basic construct 2

SEQ ID NO: 64 sets out the sequence of forward primer Left ADE1 KO flank

SEQ ID NO: 65 sets out the sequence of the reverse primer left ADE1 KO flank

SEQ ID NO: 66 sets out the sequence of the forward primer basic construct 1 with 50 bp ADE1 KO flank

SEQ ID NO: 67 sets out the sequence of the reverse primer basic construct 1

SEQ ID NO: 68 sets out the sequence of the forward primer basic construct 2 creating overlap towards basic construct 1

SEQ ID NO: 69 sets out the sequence of the reverse primer basic construct 2 with 50 bp ADE1 KO flank

SEQ ID NO: 70 sets out the sequence of the forward primer for amplification of right ADE1 KO flank

SEQ ID NO: 71 sets out the sequence of the reverse primer for amplification of right ADE1 KO flank

SEQ ID NO: 72 sets out the sequence of the PCR fragment 1 left flank ADE1

SEQ ID NO: 73 sets out the sequence of the PCR fragment 2 Basic construct 1 with ADE1 KO flanks

SEQ ID NO: 74 sets out the sequence of the PCR fragment 3 Basic construct 2 with ADE1 KO flanks

SEQ ID NO: 75 sets out the sequence of the PCR fragment left flank ADE1

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

Throughout the present specification and the accompanying claims, the words “comprise”, “include” and “having” and variations such as “comprises”, “comprising”, “includes” and “including” are to be interpreted inclusively. That is, these words are intended to convey the possible inclusion of other elements or integers not specifically recited, where the context allows.

The articles “a” and “an” are used herein to refer to one or to more than one (i.e. to one or at least one) of the grammatical object of the article. By way of example, “an element” may mean one element or more than one element.

The method according to the invention is one for carrying out recombination at a target locus. Recombination refers to a process in which a molecule of nucleic acid is broken and then joined to a different one. The recombination process of the invention typically involves the artificial and deliberate recombination of disparate nucleic acid molecules, which may be from the same or different organism, so as to create recombinant nucleic acids.

The term “recombinant” means, for example, that a nucleic acid sequence is made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated nucleic acids by genetic engineering techniques.

The method of the invention relies on a combination of homologous recombination and site-specific recombination.

“Homologous recombination” refers to a reaction between nucleotide sequences having corresponding sites containing a similar nucleotide sequence (i.e., homologous sequences) through which the molecules can interact (recombine) to form a new, recombinant nucleic acid sequence. The sites of similar nucleotide sequence are each referred to herein as a “homologous sequence”. Generally, the frequency of homologous recombination increases as the length of the homology sequence increases. Thus, while homologous recombination can occur between two nucleic acid sequences that are less than identical, the recombination frequency (or efficiency) declines as the divergence between the two sequences increases. Recombination may be accomplished using one homology sequence on each of two molecules to be combined, thereby generating a “single-crossover” recombination product. Alternatively, two homology sequences may be placed on each of two molecules to be recombined. Recombination between two homology sequences on the donor with two homology sequences on the target generates a “double-crossover” recombination product.

If the homology sequences on the donor molecule flank a sequence that is to be manipulated (e.g., a sequence of interest), the double-crossover recombination with the target molecule will result in a recombination product wherein the sequence of interest replaces a DNA sequence that was originally between the homology sequences on the target molecule.
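As a toy illustration of the double-crossover outcome described above, the following Python sketch models nucleic acids as plain strings. The function name `double_crossover`, the arm sequences and the placeholder payloads are hypothetical and chosen only for illustration. Homologous recombination in a cell is an enzymatic process whose frequency depends on homology length and sequence divergence, none of which is modeled here.

```python
def double_crossover(target: str, donor: str, left_arm: str, right_arm: str) -> str:
    """Return the target with the sequence between the two homology arms
    replaced by the sequence that lies between the same arms on the donor."""
    # Position just after the left arm and at the start of the right arm,
    # in each molecule. str.index raises ValueError if an arm is missing
    # or if the right arm does not occur after the left arm.
    t_start = target.index(left_arm) + len(left_arm)
    t_end = target.index(right_arm, t_start)
    d_start = donor.index(left_arm) + len(left_arm)
    d_end = donor.index(right_arm, d_start)

    sequence_of_interest = donor[d_start:d_end]
    return target[:t_start] + sequence_of_interest + target[t_end:]


# Toy example: homology arms in upper case, placeholder payloads in lower case.
LEFT_ARM, RIGHT_ARM = "AAAACCCC", "GGGGTTTT"
target = "tt" + LEFT_ARM + "oldsequence" + RIGHT_ARM + "tt"
donor = LEFT_ARM + "newsequence" + RIGHT_ARM

product = double_crossover(target, donor, LEFT_ARM, RIGHT_ARM)
print(product)
```

In this simplified model the sequence originally between the arms on the target is lost and the donor's sequence of interest takes its place, while both arms and the surrounding target sequence are unchanged.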

“Site-specific recombination”, also known as conservative site-specific recombination, is a type of recombination in which nucleic acid strand exchange takes place between segments possessing only a limited degree of sequence homology. Site-specific recombinase enzymes perform rearrangements of nucleic acid segments by recognizing and binding to short DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved and rejoin the DNA strands. In some site-specific recombination systems, a recombinase enzyme together with the recombination sites is enough to perform all these reactions; in other systems, a number of accessory proteins and accessory sites may also be needed.
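The excision outcome of site-specific recombination can be sketched in the same string-based way. The following Python example is a minimal illustration under stated assumptions: the site is represented by a hypothetical token `[site]`, the two sites are assumed to be in the same orientation, and only the excision outcome is modeled (no strand-exchange mechanism, no accessory factors). The function name `excise_between_sites` is likewise hypothetical.

```python
from typing import Tuple


def excise_between_sites(seq: str, site: str) -> Tuple[str, str]:
    """Model excision between the first two occurrences of `site`.

    Returns (remaining, excised): the remaining molecule keeps one copy of
    the site, and the excised product carries the intervening sequence
    plus the other copy of the site.
    """
    first = seq.find(site)
    if first == -1:
        raise ValueError("no recombination site found")
    second = seq.find(site, first + len(site))
    if second == -1:
        raise ValueError("two recombination sites are required")

    after_first = first + len(site)
    after_second = second + len(site)
    remaining = seq[:after_first] + seq[after_second:]
    excised = seq[after_first:after_second]
    return remaining, excised


# Toy locus: flanks and marker are placeholder labels, not real sequences.
locus = "FLANK5" + "[site]" + "MARKER" + "[site]" + "FLANK3"
remaining, excised = excise_between_sites(locus, "[site]")
print(remaining)
print(excised)
```

The point of the sketch is that a single recombination site is left behind at the locus once the intervening sequence has been removed.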

The method may be used to carry out recombination at a target locus resulting in modification of that target locus. Accordingly, the invention may be used to add, delete or otherwise change a target locus. The target locus may be a coding or a non-coding sequence. The method of the invention may be used so that such coding or non-coding sequence may be disrupted and/or partially or fully deleted and/or replaced. Thus, the method of the invention may be used to replace sequence at the target locus, for example with a marker-encoding sequence.

Typically, the invention is carried out in vivo in a host cell (such as a cell of a microorganism). The host cell may, preferably, be one which produces a compound of interest. The host cell may be capable of producing the compound of interest prior to application of the method of the invention. In this case, the method of the invention may be used to modify the target locus so that production of the compound of interest by the host cell is altered, for example production may be increased. Alternatively, the host cell may be one which produces the compound of interest as a result of application of the method of the invention.

Accordingly, the invention may be used, for example, in the optimization of the productivity of a host cell and/or the processes in which it is used. Alternatively, the invention may be used, for example, to introduce novel nucleic acids such that the host cell is rendered capable of producing a new compound of interest. The invention may be used sequentially, such that a plurality of novel nucleic acid sequences is introduced into the host cell, resulting in the introduction of an entirely new pathway or metabolic pathway.

A target locus may be any nucleic acid sequence which is to be modified. Typically, the target locus may be a sequence within a genome (the complete genetic material of an organism), for example a locus on a chromosome. Such a chromosome could be a linear or a circular chromosome. However, the target locus could be extrachromosomal, for example a locus on a plasmid, a minichromosome or artificial chromosome. The target locus may be located on a plasmid, a phage, or any other nucleic acid sequence which is able to replicate or be replicated in vitro or in a host cell.

The method of the invention may be carried out in vitro, ex vivo or in vivo.

The method of the invention comprises:

- providing two or more nucleic acids which, when taken together, comprise: (a) sequences capable of homologous recombination with sequences flanking the target locus; (b) two or more site-specific recombination sites; (c) a sequence encoding a recombinase which recognizes the site-specific recombination sites; and (d) a sequence encoding a marker,
- wherein the two or more nucleic acids are capable of homologous recombination with each other so as to give rise to a single nucleic acid, and
- wherein at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of the marker; and
- recombining the said two or more nucleic acids with each other and with the sequences flanking the target locus so that a contiguous nucleic acid sequence encoding a functional marker and the sequence encoding the recombinase are inserted at the target locus, said marker-encoding and/or recombinase-encoding sequence being flanked by at least two site-specific recombination sites and the said site-specific recombination sites being flanked by the sequences capable of homologous recombination with sequences flanking the target locus.

In the invention, at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of a marker. That is to say, the marker-encoding sequence is split across at least two of the two or more nucleic acids. Accordingly, the method may be referred to as a split-marker approach.

Out-recombination of the nucleic acid sequence between the site-specific recombination sites, for example of the marker, may be carried out in vivo.

In the method of the invention, the in vivo recombination may be carried out in any suitable host cell, for example carried out in a prokaryotic or a eukaryotic cell.

In the method of the invention, recombination of the nucleic acids with each other and with the target locus is carried out in vivo.

In the method of the invention, two or more nucleic acids are provided. Taken together, the two or more nucleic acids provide: (a) sequences capable of homologous recombination with sequences flanking the target locus; (b) two or more site-specific recombination sites; (c) a sequence encoding a recombinase which recognizes the site-specific recombination sites; and (d) a sequence encoding a marker.

It is not intended that each of the two or more nucleic acids comprises the sequences set out in (a), (b), (c) and (d). Rather, the sequences set out in (a), (b), (c) and (d) must be comprised by the two or more nucleic acids when those nucleic acids are taken together as a group. Thus, one nucleic acid may comprise one or more of the sequences set out in (a), (b), (c) and (d), and a second nucleic acid may comprise the other sequences set out in (a), (b), (c) and (d). Typically, each of the two or more nucleic acids will comprise at least one of the sequences set out in (a), (b), (c) and (d). However, additional nucleic acids may be provided which do not comprise at least one of the sequences set out in (a), (b), (c) or (d).
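The requirement that the sequences (a) to (d) are provided by the group of nucleic acids, rather than by each nucleic acid individually, can be pictured as a simple counting check. The following is a minimal sketch only: the feature labels, the minimum counts (which assume a single target locus with two flanks) and the function name `group_is_complete` are hypothetical and are not taken from the method itself.

```python
from collections import Counter

# Hypothetical feature labels. The minimum counts are an assumption
# made for a single target locus (two flanks, two recombination sites).
MINIMUM = {
    "homology_flank": 2,       # (a)
    "recombination_site": 2,   # (b)
    "recombinase": 1,          # (c)
}

def group_is_complete(fragments):
    """Return True if the fragments, taken together, provide (a)-(c)
    and at least two fragments each carry a portion of the marker (d)."""
    total = Counter()
    for features in fragments.values():
        total.update(features)
    counts_ok = all(total[name] >= n for name, n in MINIMUM.items())
    carriers = sum(
        1 for features in fragments.values() if "marker_portion" in features
    )
    return counts_ok and carriers >= 2

# A two-fragment arrangement in which neither fragment is complete alone.
two_fragments = {
    "left": ["homology_flank", "recombination_site", "marker_portion"],
    "right": ["marker_portion", "recombinase",
              "recombination_site", "homology_flank"],
}
print(group_is_complete(two_fragments))
```

Neither fragment in the example passes the check on its own; only the pair does, which mirrors the statement that the sequences need only be present when the nucleic acids are taken together.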

One format for the method is set out in FIG. 6 in which two nucleic acids are used, but the skilled person will readily be able to conceive of further formats. The number of nucleic acids used in the method may be two, three, four, five, six or more.

Typically, the marker-encoding sequence will be split over two nucleic acid sequences (each of these two nucleic acid sequences encoding a non-functional portion of the marker, but when recombined the two will encode a functional marker). However, the marker-encoding sequence could be split over three, four or more nucleic acid sequences.

When the marker-encoding sequence is split over two nucleic acid sequences, each of those two sequences may typically also comprise a site-specific recombination site. This approach is shown in FIG. 6. Alternatively, the site-specific recombination sites may be provided on additional nucleic acid sequences capable of recombining with the nucleic acid sequences comprising the marker-encoding sequence.

In the method of the invention, the two or more nucleic acids are capable of homologous recombination with each other so as to give rise to a single nucleic acid. The nucleic acids are incorporated as a single contiguous sequence at a target locus due to the presence of the sequences capable of homologous recombination with sequences flanking the target locus. In addition, at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of the marker.

Accordingly, in the method of the invention, the two or more nucleic acids are recombined with each other and with sequences flanking the target locus. In this way, a contiguous nucleic acid sequence encoding a functional marker may be inserted at the target locus together with a recombinase-encoding sequence and at least two site-specific recombination sites. This functional marker-encoding sequence is typically inserted at the target locus such that it is flanked by at least two site-specific recombination sites. When the recombinase is expressed, the sequence situated between the site-specific recombination sites may be out-recombined. If the marker-encoding and/or recombinase-encoding sequence is located between the site-specific recombination sites, it/they will be out-recombined. However, if the marker-encoding and/or recombinase-encoding sequence lies outside the site-specific recombination sites, it will be retained at the target locus.
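The effect of the position of a sequence relative to the site-specific recombination sites can be illustrated with a toy model. The sketch below assumes exactly two directly repeated sites and represents the locus as an ordered list of labels; the element names and the function `out_recombine` are hypothetical and serve only as an illustration.

```python
def out_recombine(locus, site="site"):
    """Remove everything between the first and last recombination site.

    A single site is left behind at the locus, as is the case on
    excision between two directly repeated sites (an assumption of
    this toy model).
    """
    positions = [i for i, element in enumerate(locus) if element == site]
    if len(positions) < 2:
        return list(locus)  # nothing to out-recombine
    first, last = positions[0], positions[-1]
    return locus[:first] + [site] + locus[last + 1:]

# Marker and recombinase both between the sites: both are removed.
inside = ["5' flank", "site", "marker", "recombinase", "site", "3' flank"]
print(out_recombine(inside))

# Recombinase outside the sites: it is retained at the target locus.
outside = ["5' flank", "recombinase", "site", "marker", "site", "3' flank"]
print(out_recombine(outside))
```

In the first arrangement only the flanks and one site remain; in the second the recombinase-encoding sequence stays at the locus because it lies outside the pair of sites.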

When recombination has taken place, the site-specific recombination sites, marker and recombinase sequence will be flanked by the sequences capable of homologous recombination with sequences flanking the target locus.

It may also be possible to carry out the method of the invention by adding the recombinase separately, using for example a plasmid (comprising a sequence encoding the recombinase), or by direct addition of a recombinase protein.

The method of the invention may be carried out so that more than one, for example two, three, four, five or more target loci are targeted simultaneously. In such a method, the two or more nucleic acids, when taken together, comprise sequences capable of homologous recombination with sequences flanking two or more target loci. In this way, recombination of the said two or more nucleic acids with each other and with the sequences flanking the target loci results in the insertion of at least two site-specific recombination sites at each target locus. The two or more nucleic acids provided are such that a nucleic acid sequence encoding a functional recombinase is inserted in at least one target locus, optionally located between at least two site-specific recombination sites. It is not necessary for other target loci to comprise a functional recombinase-encoding sequence, but each target locus will comprise at least two site-specific recombination sites (which may be targeted by the recombinase). At least two nucleic acids are provided, each comprising sequence encoding a non-functional marker. Thus, one or more functional marker-encoding sequences may be inserted at one or more of the target loci. The method of the invention may though be carried out such that a sequence encoding a functional marker is inserted at all or some of the target loci.

Again, at each target locus, the said site-specific recombination sites and any marker-encoding and recombinase-encoding sequence will be flanked by the sequences capable of homologous recombination with sequences flanking the target locus.

In the method of the invention, the two or more nucleic acids are capable of homologous recombination with each other so as to give rise to a single nucleic acid. The nucleic acids are incorporated as a single contiguous sequence at a target locus due to the presence of the sequences capable of homologous recombination with sequences flanking the target locus.

In more detail, the two or more nucleic acids provided in the invention, when taken together, comprise two or more sequences capable of homologous recombination with sequences flanking the target locus. Where the method targets a single target locus, the two or more nucleic acids will typically provide two such sequences. These sequences are provided such that a contiguous nucleic acid sequence comprising the two or more nucleic acids (when recombined with each other) is inserted at the target locus via recombination with substantially homologous sequences which flank the target sequence.

It will be obvious to the skilled person that, in order to achieve homologous recombination via a double cross-over event, these flanking sequences need to be present at both sides/ends of the contiguous sequence resulting from recombination of the two or more nucleic acids and need to be substantially homologous to sequences at both sides of the target locus. Thus, the sequences capable of homologous recombination are typically provided such that they are located at the “5′” and “3′” ends of the nucleic acid sequence resulting from recombination of the two or more nucleic acids.

Moreover, the at least two nucleic acids provided according to the invention are capable of undergoing recombination with each other. Thus, the ends of the nucleic acids are conveniently designed such that this may take place and that the nucleic acids will be assembled in the desired orientation and order. Accordingly the sequence of the ends of a provided nucleic acid will be substantially homologous to the sequences of the ends of the nucleic acids with which it is intended to be recombined.
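The way in which homologous ends fix the order in which the nucleic acids are assembled can be sketched with a toy string model. The sequences below are invented and far shorter than any overlap that would be used in practice, and the function `join_by_overlap` is a hypothetical helper; the sketch only shows that two fragments, each carrying a non-functional portion of a marker, give the contiguous marker sequence once joined through their shared region.

```python
def join_by_overlap(left, right, min_overlap=4):
    """Join two sequences through the longest suffix of `left`
    that is identical to a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no sufficient overlap between the two sequences")

# Invented toy sequences (assumptions for illustration only).
flank5 = "GGGTTTAAAC"
marker = "ATGGCCAAGCTGACCTGA"
flank3 = "CCCGGGTTAA"

# Each fragment carries a flank and an incomplete part of the marker;
# the two marker parts share six bases (marker[6:12]).
left = flank5 + marker[:12]
right = marker[6:] + flank3

assembled = join_by_overlap(left, right)
print(marker in left, marker in right, marker in assembled)
```

Neither fragment contains the complete marker sequence, whereas the assembled sequence does, with the two flanks at its ends.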

With the term “substantially homologous” as used in this invention is meant that a first nucleic acid sequence has a degree of identity with a second nucleic acid sequence with which it is to be recombined of at least about 70%, at least about 80%, preferably at least about 90%, at least 95%, at least 98%, at least 99%, most preferably 100% over a region of not more than about 3 kb, preferably not more than about 2 kb, more preferably not more than about 1 kb, even more preferably not more than about 0.5 kb, even more preferably not more than about 0.2 kb, even more preferably not more than about 0.1 kb, such as not more than about 0.05 kb, for example not more than about 0.03 kb. In filamentous fungi, the optimal size may be from about 500 bp to about 2.5 kb. The degree of required identity may thereby depend on the length of the substantially homologous sequence. The shorter the homologous sequence, the higher the percentage homology may be.
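As an illustration of a degree of identity expressed as a percentage, the following minimal sketch counts identical positions between two sequences. It assumes that the two sequences are already aligned without gaps and are of equal length; the text does not prescribe how identity is to be calculated, so this is one simple convention among several, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two pre-aligned,
    equal-length sequences (ungapped comparison)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

def substantially_homologous(a, b, threshold=70.0):
    """Apply a chosen identity threshold, e.g. 70, 80 or 90 percent."""
    return percent_identity(a, b) >= threshold

first = "ATGCATGCAT"
second = "ATGCATGCAA"  # differs from `first` at the last position only
print(percent_identity(first, second))
print(substantially_homologous(first, second, threshold=95.0))
```

With one mismatch in ten positions the identity is 90%, which meets a 70% or 90% threshold but not a 95% threshold.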

In the invention, the two or more nucleic acids, taken together, comprise two or more site-specific recombination sites. These site-specific recombination sites are recognised by a recombinase which is encoded by the two or more nucleic acids, taken together.

The site-specific recombination sites and recombinase are selected such that the recombinase may target the site-specific recombination sites leading to out-recombination of sequence located between the recombination sites.

The terms “recombinase” or “site-specific recombinase” or the like refer to enzymes or recombinases that recognize and bind to a short nucleic acid site or “site-specific recombinase site”, i.e., a recombinase recognition site, and catalyze the recombination of nucleic acid in relation to these sites. These enzymes include recombinases, transposases and integrases.

The “site-specific recombinase site” or the like refers to short nucleic acid sites or sequences, i.e., recombinase recognition sites, which are recognized by a sequence- or site-specific recombinase and which become the crossover regions during a site-specific recombination event. Examples of sequence-specific recombinase target sites include, but are not limited to, lox sites, att sites, dif sites and frt sites.

The term “lox site” as used herein refers to a nucleotide sequence at which the product of the cre gene of bacteriophage P1, the Cre recombinase, can catalyze a site-specific recombination event. A variety of lox sites are known in the art, including the naturally occurring loxP, loxB, loxL and loxR, as well as a number of mutant, or variant, lox sites, such as lox66, lox71, loxP511, loxP514, loxΔ86, loxΔ117, loxC2, loxP2, loxP3 and loxP23.

The term “frt site” as used herein refers to a nucleotide sequence at which the product of the FLP gene of the yeast 2 micron plasmid, FLP recombinase, can catalyze site-specific recombination.

The site-specific recombination sites may be such that out-recombination following recombinase expression gives rise to a single mutant site-specific recombination site at the target locus which is not recognized by the recombinase. In particular, the lox sites may be lox66 and lox71 (Albert, H., Dale, E. C., Lee, E., & Ow, D. W. (1995). Site-specific integration of DNA into wild-type and mutant lox sites placed in the plant genome. Plant Journal, 7(4), 649-659). In a specific embodiment, the lox66 and lox71 site-specific recombination sites may be such that out-recombination following recombinase expression gives rise to a lox72 mutant site-specific recombination site at the target locus which is not recognized by the recombinase.
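The formation of a doubly mutated site from lox66 and lox71 can be pictured by treating each lox site as a left arm and a right arm around a central spacer in which the cross-over occurs. The sketch below is symbolic and uses no actual sequences; it assumes that lox71 carries its mutation in the left arm and lox66 in the right arm, that the two sites are directly repeated, and that lox71 lies upstream of lox66. The class and function names are hypothetical.

```python
from typing import NamedTuple

class LoxSite(NamedTuple):
    left_arm: str   # "wt" (wild-type) or "mut" (mutant)
    right_arm: str

# Symbolic arm assignments (assumed for this illustration).
LOXP = LoxSite("wt", "wt")
LOX71 = LoxSite("mut", "wt")
LOX66 = LoxSite("wt", "mut")
LOX72 = LoxSite("mut", "mut")

def excise(upstream, downstream):
    """Cross over in the spacer of two directly repeated sites.

    Returns (site remaining at the locus, site on the excised DNA).
    """
    remaining = LoxSite(upstream.left_arm, downstream.right_arm)
    excised = LoxSite(downstream.left_arm, upstream.right_arm)
    return remaining, excised

remaining, excised = excise(LOX71, LOX66)
print(remaining == LOX72, excised == LOXP)
```

In this arrangement the site left at the target locus carries both mutant arms, corresponding to lox72, while the excised sequence carries a wild-type site. Reversing the order of the two sites in the model would leave a wild-type site at the locus instead, so the relative placement of the two mutant sites matters.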

In addition to the recombinase, site-specific recombination sites and sequences capable of homologous recombination with sequences flanking the target locus, the method of the invention is carried out such that the two or more nucleic acids, taken together, comprise a marker-encoding sequence, so that recombination of the two or more nucleic acids results in the said marker-encoding sequence being inserted at the target locus or loci. Such a marker-encoding sequence may be located between at least two of the sequences capable of homologous recombination with sequences flanking the target locus or loci.

Critically, the two or more nucleic acids are provided so that at least two of the nucleic acids each comprise a sequence encoding a non-functional part of the marker-encoding sequence. When the two or more nucleic acids are recombined, this gives rise to a contiguous sequence encoding a functional marker. Accordingly, the method of the invention is referred to as a “split-marker” approach.

Non-functional in the context of this invention refers to a sequence which does not encode a product capable of acting as a functional selection marker.

Typically, the method may be carried out so that a marker-encoding sequence is located between two or more site-specific recombination sites. In this way, the marker gene may be out-recombined on expression of the recombinase. Accordingly the method can be used for dominant markers and counter selectable markers.

In this way, the method may be carried out in a repeated fashion with more than one cycle of homologous recombination with sequences flanking the target locus followed by out-recombination following recombinase expression using the same marker. This approach may be further combined with the use of mutant site-specific recombination sites which cannot be targeted by the recombinase once the marker has out-recombined.

One advantage of the invention is that it allows multiple recombination events to be carried out simultaneously, sequentially or separately.

Accordingly the method may be carried out in a repeated fashion with more than one cycle of recombination using the same marker. Accordingly the method is especially applicable if a limited set of markers is available. This approach may be further combined with the use of mutant site-specific recombination sites which cannot be targeted by the recombinase once the marker has out-recombined. This allows multiple sites to be targeted and the amount of sites targeted is not limited by the availability of different markers since the marker is eliminated via activation of the recombinase.
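The repeated use of a single marker over several cycles can be sketched as a loop in which each cycle consists of integration at one target locus followed by out-recombination of the marker. This is a toy bookkeeping model only; the element labels, the default marker name and the function `run_cycles` are hypothetical.

```python
def run_cycles(target_loci, marker="marker"):
    """Modify several loci one after another with the same marker."""
    genome = {locus: ["target"] for locus in target_loci}
    for locus in target_loci:
        # The same marker is only informative for selection if no copy
        # remains in the cell from an earlier cycle.
        if any(marker in elements for elements in genome.values()):
            raise RuntimeError("marker from a previous cycle is still present")
        # Integration by homologous recombination replaces the target.
        genome[locus] = ["site", marker, "recombinase", "site"]
        # Recombinase expression out-recombines the sequence between the
        # sites, leaving a single mutant site that is no longer targeted.
        genome[locus] = ["mutant_site"]
    return genome

result = run_cycles(["locus_A", "locus_B", "locus_C"])
print(result)
```

Because the marker is removed at the end of every cycle, the number of loci that can be modified in this model does not depend on the number of different markers available.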

In a method of the invention, the two or more nucleic acids, taken together, may comprise two or more different marker-encoding sequences such that recombination of the two or more nucleic acids results in two or more different marker-encoding sequences being inserted at a target locus or loci. This method may be carried out where sequences capable of homologous recombination with sequences flanking two or more target loci are provided. It is further possible that one marker may be used to target at least two target loci and a different marker used to target one or more further target loci.

In a method of the invention, one of the marker-encoding sequences will be split. In another preferred embodiment of the invention, two or more or even all of the marker-encoding sequences will typically be split. That is to say, for each marker the two or more nucleic acids are provided so that at least two of the nucleic acids each comprise a sequence encoding a non-functional part of the marker-encoding sequence. When the two or more nucleic acids are recombined, this gives rise to a contiguous sequence encoding a functional marker. A method of the invention may include at least one split marker. Typically, all marker-encoding sequences used are provided in a split format.

The method may be carried out such that one or more identical or non-identical markers, each marker being flanked by lox sites, are recombined into a cell. The method of the invention may then be used to provide a further recombination event and at the same time remove all of such markers.

In the method of the invention, the target locus may comprise a coding sequence which is disrupted and/or partially or fully deleted. Typically, the method adds new sequence at the target locus; this new sequence will typically replace, delete and/or modify a sequence at the target locus.

As set out above, the replacement sequence may for instance confer a selectable phenotype when the recombination is carried out in vivo in a host cell. In that case, the replacement sequence comprises a selection marker. Preferentially, such a method is carried out so that the marker may be out-recombined on expression of the recombinase.

The replacement sequence may also be a modified version of the target sequence, for instance to provide for altered regulation of a sequence of interest or expression of a modified gene product with altered properties as compared to the original gene product.

The replacement sequence may also constitute additional copies of a sequence of interest being present in the genome of the host cell, to obtain amplification of that sequence of interest.

The replacement sequence may be a sequence homologous or heterologous to the host cell. It may be obtainable from any suitable source or may be prepared by custom synthesis.

The target sequence may be any sequence of interest. For instance, the target sequence may be a sequence of which the function is to be investigated by inactivating or modifying the sequence. The target sequence may also be a sequence of which inactivation, modification or over expression is desirable to confer a desired phenotype on the host cell. Typically, the method of the invention will result in some nucleic acid sequence being removed at the target locus. However, the method of the invention may be used to insert sequence at the target locus without any sequence being lost from the target locus.

In the context of this disclosure, the terms “nucleic acid”, “nucleic acid sequence”, “polynucleotide”, “polynucleotide sequence”, “nucleic acid fragment” and “isolated nucleic acid fragment” may be used interchangeably.

These terms encompass nucleotide sequences and the like. A nucleic acid may be a polymer of DNA or RNA that may be single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases.

A nucleic acid in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof.

The term “isolated nucleic acid” and the like refers to a nucleic acid that is substantially free from other nucleic acid sequences, such as, but not limited to, other chromosomal and extrachromosomal DNA and/or RNA. Isolated nucleic acids may be purified from a host cell in which they naturally occur.

Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated nucleic acids. The term also embraces recombinant nucleic acids and chemically synthesized nucleic acids. Typically, each of the two or more nucleic acids suitable for use in the invention may be generated by any amplification process known in the art (e.g., PCR, RT-PCR and the like). The terms “amplify”, “amplification”, “amplification reaction”, or “amplifying” as used herein refer to any in vitro process for multiplying the copies of a target sequence of nucleic acid. Amplification sometimes refers to an “exponential” increase in target nucleic acid. However, “amplifying” as used herein can also refer to linear increases in the numbers of a select target sequence of nucleic acid, but is typically different than a one-time, single primer extension step.

The two or more nucleic acids are typically introduced into a host cell so that the recombination events may take place. The two or more nucleic acids can be introduced into a host cell using various techniques which are well-known to those skilled in the art. Non-limiting examples of methods used to introduce heterologous nucleic acids into various organisms include: transformation, transfection, transduction, electroporation, ultrasound-mediated transformation, particle bombardment and the like. In some instances the addition of carrier molecules can increase the uptake of DNA in cells typically thought to be difficult to transform by conventional methods. Conventional methods of transformation are readily available to the skilled person.

The procedures used to generate the two or more nucleic acids and to then introduce them into a host cell are well known to one skilled in the art (see, e.g. Sambrook & Russell, Molecular Cloning: A Laboratory Manual, 3rd Ed., CSHL Press, Cold Spring Harbor, N.Y., 2001; and Ausubel et al., Current Protocols in Molecular Biology, Wiley InterScience, NY, 1995).

Furthermore, standard molecular biology techniques such as DNA isolation, gel electrophoresis, enzymatic restriction modifications of nucleic acids, Southern analyses, transformation of cells, etc., are known to the skilled person and are for example described by Sambrook et al. (1989) “Molecular Cloning: a laboratory manual”, Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y. and Innis et al. (1990) “PCR protocols, a guide to methods and applications” Academic Press, San Diego.

A nucleic acid suitable for use in the method of the invention may be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector if desirable and/or characterized by nucleic acid sequence analysis.

The method of the invention may be carried out such that the two or more nucleic acids are recombined as a single nucleic acid which is then recombined with the target locus.

The method of the invention may be carried out where recombination of the said two or more nucleic acids with each other and with the target locus takes place simultaneously.

In a method of the invention two of the at least two nucleic acids may each comprise a part of the marker-encoding sequence such that together they comprise the entire marker-encoding sequence.

The method of the invention may be carried out so that the recombinase directed against the site-specific recombination sites is expressed such that the sequence located between the two site-specific recombination sites is out-recombined.

The expression of the marker and recombinase will typically be under the control of control sequences including a promoter which enable expression of the marker and the recombinase within the host cell. That is to say, the marker- and recombinase-encoding sequences will typically be in operable linkage with a promoter sequence.

The term “operable linkage” or “operably linked” or the like are defined herein as a configuration in which a control sequence is appropriately placed at a position relative to the coding sequence of the DNA sequence such that the control sequence directs the production of an mRNA or a polypeptide.

The term “control sequences” is defined herein to include all components, which are necessary or advantageous for the production of mRNA or a polypeptide, either in vitro or in a host cell. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences include, but are not limited to, a leader, Shine-Dalgarno sequence, optimal translation initiation sequences (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870), a polyadenylation sequence, a pro-peptide sequence, a pre-pro-peptide sequence, a promoter, a signal sequence, and a transcription terminator. At a minimum, the control sequences include a promoter, and a transcriptional stop signal as well as translational start and stop signals. Control sequences may be optimized to their specific purpose. Preferred optimized control sequences used in the present invention are those described in WO2006/077258.

The term “promoter” is defined herein as a DNA sequence that binds RNA polymerase and directs the polymerase to the correct downstream transcriptional start site of a nucleic acid sequence encoding a biological compound to initiate transcription. RNA polymerase effectively catalyzes the assembly of messenger RNA complementary to the appropriate DNA strand of a coding region. The term “promoter” will also be understood to include the 5′-non-coding region (between promoter and translation start) for translation after transcription into mRNA, cis-acting transcription control elements such as enhancers, and other nucleotide sequences capable of interacting with transcription factors.

Accordingly, a marker may be split by providing a promoter on a first nucleic acid and the coding sequence on a second nucleic acid such that the promoter and coding sequence are brought into operable linkage on recombination, i.e. recombination will give rise to a functional marker-encoding sequence.

The promoter may be any appropriate promoter sequence suitable for a eukaryotic or prokaryotic host cell, which shows transcriptional activity, including mutant, truncated, and hybrid promoters, and may be obtained from polynucleotides encoding extra-cellular or intracellular polypeptides either homologous (native) or heterologous (foreign) to the cell. The promoter may be a constitutive or inducible promoter. Expression of the recombinase by an inducible promoter will allow out-recombination of the sequence located between the site-specific recombination sites to be controlled, for example including the recombinase encoding sequence.

Examples of inducible promoters that can be used are starch-, cellulose-, hemicellulose- (such as xylan- and/or xylose-), copper- and oleic acid-inducible promoters. The promoter may be selected from the group, which includes but is not limited to promoters obtained from the polynucleotides encoding A. oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger neutral alpha-amylase, A. niger acid stable alpha-amylase, A. niger or A. awamori glucoamylase (glaA), A. niger or A. awamori endoxylanase (xlnA) or beta-xylosidase (xlnD), T. reesei cellobiohydrolase I (CBHI), R. miehei lipase, A. oryzae alkaline protease, A. oryzae triose phosphate isomerase, A. nidulans acetamidase, Fusarium venenatum amyloglucosidase (WO 00/56900), Fusarium venenatum Dania (WO 00/56900), Fusarium venenatum Quinn (WO 00/56900), Fusarium oxysporum trypsin-like protease (WO 96/00787), Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase II, Trichoderma reesei endoglucanase I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase IV, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei beta-xylosidase, as well as the NA2-tpi promoter (a hybrid of the promoters from the polynucleotides encoding A. niger neutral alpha-amylase and A. oryzae triose phosphate isomerase), and mutant, truncated, and hybrid promoters thereof. Other examples of promoters are the promoters described in WO2006/092396 and WO2005/100573, which are herein incorporated by reference. A further example of the use of promoters is described in WO2008/098933. Other examples of inducible (heterologous) promoters are the alcohol inducible promoter alcA, the tet system using the tetracycline-responsive promoter, and the estrogen-responsive promoter (Pachlinger et al. (2005), Appl & Environmental Microbiol 672-678).

The control sequences may also include a suitable transcription terminator (terminator) sequence, a sequence recognized by a filamentous fungal cell to terminate transcription. The terminator sequence is operably linked to the 3′-terminus of the nucleic acid sequence encoding the polypeptide. Any terminator, which is functional in the cell, may be used in the present invention.

The control sequence may also be a suitable leader sequence (leaders), a non-translated region of an mRNA which is important for translation by the filamentous fungal cell. The leader sequence is operably linked to the 5′-terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence, which is functional in the cell, may be used in the present invention.

Depending on the host, suitable leaders may be obtained from the polynucleotides encoding A. oryzae TAKA amylase and A. nidulans triose phosphate isomerase and A. niger GlaA and phytase.

Other control sequences may be isolated from the Penicillium IPNS gene, or pcbC gene, the beta tubulin gene. All the control sequences cited in WO 01/21779 are herewith incorporated by reference.

The control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3′-terminus of the nucleic acid sequence and which, when transcribed, is recognized by the filamentous fungal cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence, which is functional in the cell, may be used in the present invention.

As set out herein, in a method of the invention, the two or more nucleic acids, taken together, may comprise a sequence encoding a marker so that recombination of the two or more nucleic acids results in the said marker-encoding sequence being located between the homologous recombination sites.

Recombination of the two or more nucleic acids may result in the said marker-encoding sequence being located between the site-specific recombination sites such that the marker may be out-recombined on expression of the recombinase.

Any suitable marker may be used and such markers are well-known to determine whether a nucleic acid is included in a cell. Typically, a marker, such as a selectable marker, permits easy selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.

Examples of marker genes include, but are not limited to, (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., essential products, tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be readily identified (e.g., phenotypic markers such as antibiotic resistance markers (e.g., β-lactamase), β-galactosidase, fluorescent or other coloured markers, such as green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP) and cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments as described in 1-5 above (e.g., antisense oligonucleotides); (7) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases); (8) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites); (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules); (10) nucleic acid segments that, when absent, directly or indirectly confer resistance or sensitivity to particular compounds; (11) nucleic acid segments that encode products that either are toxic or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells; (12) nucleic acid segments that inhibit replication, partition or heritability of nucleic acid molecules that contain them; (13) nucleic acid segments that encode conditional replication functions, e.g., replication in certain hosts or host cell strains or under certain environmental conditions (e.g., temperature, nutritional conditions, and the like); and/or nucleic acid segments that encode an essential gene.

A selectable marker for use in a filamentous fungal cell may be selected from the group including, but not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), bleA (phleomycin binding), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), NAT or NTC (Nourseothricin) and trpC (anthranilate synthase), as well as equivalents from other species. Preferred for use in an Aspergillus and Penicillium cell are the amdS (see for example EP 635574 B1, EP0758020A2, EP1799821A2, WO 97/06261A2) and pyrG genes of A. nidulans or A. oryzae and the bar gene of Streptomyces hygroscopicus and hyg and pheo. More preferably an amdS gene is used, even more preferably an amdS gene from A. nidulans or A. niger. A most preferred selectable marker gene is the A. nidulans amdS coding sequence fused to the A. nidulans gpdA promoter (see EP 635574 B1). Other preferred AmdS markers are those described in WO2006/040358. AmdS genes from other filamentous fungi may also be used (WO 97/06261).

In the method of the invention, the in vivo recombination may be carried out in any suitable host cell, for example carried out in a prokaryotic or a eukaryotic cell. A suitable eukaryotic host cell may be a mammalian, insect, plant, fungal or algal cell. A host cell may be a microorganism or microbial host cell, for example a prokaryotic or eukaryotic host cell. Typically, the method of the invention will not be carried out in vivo in a human or animal.

Typically, a host cell used in the method according to the invention may be one suitable for the production of a compound of interest and the selection of the host cell may be made according to such use. For example, if the compound of interest produced in a host cell according to the invention is to be used in food applications, a host cell may be selected from a food-grade organism such as Saccharomyces cerevisiae. Specific uses include, but are not limited to, food, (animal) feed, pharmaceutical, agricultural such as crop-protection, and/or personal care applications.

The method of the invention may be used to confer on a host cell the ability to produce the compound of interest and/or to modify the way in which an existing compound of interest is produced, for example to increase the production of such a compound of interest.

A microbial host cell suitable for use in the method according to the invention may be a prokaryotic cell. Preferably, the prokaryotic host cell is a bacterial cell. The term “bacterial cell” includes both Gram-negative and Gram-positive microorganisms. Suitable bacteria may be selected from e.g. Escherichia, Anabaena, Caulobacter, Gluconobacter, Rhodobacter, Pseudomonas, Paracoccus, Bacillus, Brevibacterium, Corynebacterium, Rhizobium (Sinorhizobium), Flavobacterium, Klebsiella, Enterobacter, Lactobacillus, Lactococcus, Methylobacterium, Staphylococcus or Streptomyces. Preferably, the bacterial cell is selected from the group consisting of B. subtilis, B. amyloliquefaciens, B. licheniformis, B. puntis, B. megaterium, B. halodurans, B. pumilus, G. oxydans, Caulobacter crescentus CB 15, Methylobacterium extorquens, Rhodobacter sphaeroides, Pseudomonas zeaxanthinifaciens, Paracoccus denitrificans, E. coli, C. glutamicum, Staphylococcus carnosus, Streptomyces lividans, Sinorhizobium meliloti and Rhizobium radiobacter.

A host cell suitable for use in the invention may be a eukaryotic host cell. Such a eukaryotic cell may be a mammalian, insect, plant, fungal, or algal cell. Preferred mammalian cells include e.g. Chinese hamster ovary (CHO) cells, COS cells, 293 cells, PerC6 cells, and hybridomas. Preferred insect cells include e.g. Sf9 and Sf21 cells and derivatives thereof. More preferably, the eukaryotic cell is a fungal cell, for example a yeast cell, such as a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia strain. More preferably, the yeast cell is selected from Kluyveromyces lactis, S. cerevisiae, Hansenula polymorpha, Yarrowia lipolytica and Pichia pastoris. Most preferably, the eukaryotic cell is a filamentous fungal cell.

Filamentous fungi include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK). The filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. Filamentous fungal strains include, but are not limited to, strains of Acremonium, Agaricus, Aspergillus, Aureobasidium, Chrysosporium, Coprinus, Cryptococcus, Filibasidium, Fusarium, Geosmithia, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Phanerochaete, Pleurotus, Rasamsonia, Schizophyllum, Talaromyces, Thermoascus, Thermomyces, Thielavia, Tolypocladium, and Trichoderma.

Preferred filamentous fungal cells belong to a species of an Acremonium, Aspergillus, Chrysosporium, Myceliophthora, Penicillium, Rasamsonia, Talaromyces, Thielavia, Fusarium or Trichoderma genus, and most preferably a species of Aspergillus niger, Acremonium alabamense, Aspergillus awamori, Aspergillus foetidus, Aspergillus sojae, Aspergillus fumigatus, Talaromyces emersonii, Talaromyces thermophilus, Thermomyces lanuginosus, Thermoascus thermophilus, Thermoascus aurantiacus, Thermoascus crustaceus, Rasamsonia emersonii, Rasamsonia byssochlamydoides, Rasamsonia argillacea, Rasamsonia brevistipitata, Rasamsonia cylindrospora, Aspergillus oryzae, Chrysosporium lucknowense, Fusarium oxysporum, Myceliophthora thermophila, Trichoderma reesei, Thielavia terrestris or Penicillium chrysogenum. A more preferred host cell belongs to the genus Aspergillus, more preferably the host cell belongs to the species Aspergillus niger. When the host cell according to the invention is an Aspergillus niger host cell, the host cell preferably is CBS 513.88, CBS124.903 or a derivative thereof. A more preferred host cell belongs to the genus Penicillium, more preferably the host cell belongs to the species Penicillium chrysogenum. When the host cell according to the invention is a Penicillium chrysogenum host cell, the host cell preferably is Wisconsin 54-1255 or a derivative thereof. A more preferred host cell belongs to the genus Rasamsonia also known as Talaromyces, more preferably the host cell belongs to the species Talaromyces emersonii also known as Rasamsonia emersonii.

In the method of the invention, the in vivo recombination is carried out in a Rasamsonia cell. Accordingly, a cell for use in the invention belongs to the genus Rasamsonia, also known as Talaromyces; more preferably the host cell belongs to the species Talaromyces emersonii, also known as Rasamsonia emersonii. When the host cell according to the invention is a Talaromyces emersonii (also known as Rasamsonia emersonii) host cell, the host cell preferably is TEC-142S, a single isolate of TEC-142 (CBS 124.902), or a derivative thereof.

It may be desirable to use a thermophilic or thermotolerant fungal cell, in which case Humicola, Rhizomucor, Myceliophthora, Rasamsonia, Talaromyces, Thermomyces, Thermoascus or Thielavia cells are preferred.

Preferred thermophilic or thermotolerant fungi are Humicola grisea var. thermoidea, Humicola lanuginosa, Myceliophthora thermophila, Papulaspora thermophila, Rasamsonia byssochlamydoides, Rasamsonia emersonii, Rasamsonia argillacea, Rasamsonia eburnea, Rasamsonia brevistipitata, Rasamsonia cylindrospora, Rhizomucor pusillus, Rhizomucor miehei, Talaromyces bacillisporus, Talaromyces leycettanus, Talaromyces thermophilus, Thermomyces lanuginosus, Thermoascus crustaceus, Thermoascus thermophilus, Thermoascus aurantiacus and Thielavia terrestris.

Several strains of filamentous fungi are readily accessible to the public in a number of culture collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL), and All-Russian Collection of Microorganisms of Russian Academy of Sciences (abbreviation in Russian: VKM; abbreviation in English: RCM), Moscow, Russia. Useful strains in the context of the present invention may be Aspergillus niger CBS 513.88, CBS124.903, Aspergillus oryzae ATCC 20423, IFO 4177, ATCC 1011, CBS205.89, ATCC 9576, ATCC14488-14491, ATCC 11601, ATCC12892, P. chrysogenum CBS 455.95, P. chrysogenum Wisconsin 54-1255 (ATCC28089), Penicillium citrinum ATCC 38065, Penicillium chrysogenum P2, Thielavia terrestris NRRL8126, Talaromyces emersonii CBS 124.902, Acremonium chrysogenum ATCC 36225 or ATCC 48272, Trichoderma reesei ATCC 26921 or ATCC 56765, Aspergillus sojae ATCC11906, Myceliophthora thermophila C1, Garg 27K, VKM-F 3500 D, Chrysosporium lucknowense C1, Garg 27K, VKM-F 3500 D, ATCC44006 and derivatives thereof.

Eukaryotic cells have at least two separate pathways (one via homologous recombination (HR) and one via non-homologous recombination (NHR)) through which nucleic acids (in particular DNA) can be integrated into the host genome. The yeast Saccharomyces cerevisiae is an organism with a preference for homologous recombination (HR). The ratio of non-homologous to homologous recombination (NHR/HR) of this organism may vary from about 0.07 to 0.007.

WO 02/052026 discloses mutants of S. cerevisiae having an improved targeting efficiency of DNA sequences into its genome. Such mutant strains are deficient in a gene involved in NHR (KU70).

Contrary to S. cerevisiae, most higher eukaryotes such as filamentous fungal cells up to mammalian cells have a preference for NHR. Among filamentous fungi, the NHR/HR ratio ranges between 1 and more than 100. In such organisms, targeted integration frequency is rather low.
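To put the quoted NHR/HR ratios in perspective, the following worked relation converts a ratio into the expected fraction of targeted (HR) integrants. It is only an illustrative sketch and assumes that every integration event is either an HR event or an NHR event; the symbols r, N_HR, N_NHR and f_HR are introduced here for illustration and do not appear in the cited documents.

```latex
% r = NHR/HR ratio; f_HR = fraction of integration events that are targeted
\[
  r = \frac{N_{\mathrm{NHR}}}{N_{\mathrm{HR}}},
  \qquad
  f_{\mathrm{HR}} = \frac{N_{\mathrm{HR}}}{N_{\mathrm{HR}} + N_{\mathrm{NHR}}} = \frac{1}{1 + r}
\]
\[
  r = 0.007 \Rightarrow f_{\mathrm{HR}} \approx 0.993, \qquad
  r = 0.07 \Rightarrow f_{\mathrm{HR}} \approx 0.935,
\]
\[
  r = 1 \Rightarrow f_{\mathrm{HR}} = 0.5, \qquad
  r = 100 \Rightarrow f_{\mathrm{HR}} \approx 0.0099
\]
```

Under this assumption, a ratio of 100 corresponds to roughly one targeted integrant per hundred transformants, which is consistent with the low targeted integration frequency noted above.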

Thus, to improve the efficiency of polynucleotide assembly at the target locus, it is preferred that the efficiency of homologous recombination (HR) is enhanced in the host cell in the method according to the invention.

Accordingly, preferably in the method according to the invention, the host cell is, preferably inducibly, increased in its efficiency of homologous recombination (HR). Since the NHR and HR pathways are interlinked, the efficiency of HR can be increased by modulation of either one or both pathways. Increase of expression of HR components will increase the efficiency of HR and decrease the ratio of NHR/HR. Decrease of expression of NHR components will also decrease the ratio of NHR/HR. The increase in efficiency of HR in the host cell of the vector-host system according to the invention is preferably depicted as a decrease in ratio of NHR/HR and is preferably calculated relative to a parent host cell wherein the HR and/or NHR pathways are not modulated. The efficiency of both HR and NHR can be measured by various methods available to the person skilled in the art. A preferred method comprises determining the efficiency of targeted integration and ectopic integration of a single vector construct in both parent and modulated host cell. The ratio of NHR/HR can then be calculated for both cell types. Subsequently, the decrease in NHR/HR ratio can be calculated. In WO2005/095624, this method is extensively described.
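By way of illustration only, the calculation outlined above may be sketched as follows. The sketch assumes that targeted and ectopic integrants of the same vector construct have been counted in each strain; the function name and the transformant counts are hypothetical and are not taken from WO2005/095624.

```python
def nhr_hr_ratio(targeted: int, ectopic: int) -> float:
    """Return the NHR/HR ratio from counts of transformants.

    targeted: transformants with integration at the target locus (HR).
    ectopic:  transformants with integration elsewhere (NHR).
    """
    if targeted <= 0:
        raise ValueError("at least one targeted integrant is needed")
    if ectopic < 0:
        raise ValueError("ectopic count cannot be negative")
    return ectopic / targeted


# Hypothetical counts for a parent strain and a modulated strain.
parent_ratio = nhr_hr_ratio(targeted=5, ectopic=95)      # 95 / 5
modified_ratio = nhr_hr_ratio(targeted=80, ectopic=20)   # 20 / 80

# The modulated strain shows the lower ratio, i.e. more efficient HR.
ratio_decreased = modified_ratio < parent_ratio
```

With these invented counts the ratio falls from 19 in the parent cell to 0.25 in the modulated cell.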

Host cells having a decreased NHR/HR ratio as compared to a parent cell may be obtained by modifying the parent eukaryotic cell by increasing the efficiency of the HR pathway and/or by decreasing the efficiency of the NHR pathway. Preferably, the NHR/HR ratio thereby is decreased at least 2-fold, preferably at least 4-fold, more preferably at least 10-fold. Preferably, the NHR/HR ratio is decreased in the host cell of the vector-host system according to the invention as compared to a parent host cell by at least 5%, more preferably at least 10%, even more preferably at least 20%, even more preferably at least 30%, even more preferably at least 40%, even more preferably at least 50%, even more preferably at least 60%, even more preferably at least 70%, even more preferably at least 80%, even more preferably at least 90% and most preferably by at least 100%.
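The two ways of expressing the decrease used above (a multiple and a percentage) can be related as in the following minimal sketch. The function names and the example ratios are hypothetical, and the formulas reflect one plausible reading of the thresholds rather than a definition given in this specification.

```python
def fold_decrease(parent_ratio: float, modified_ratio: float) -> float:
    """How many times smaller the modified NHR/HR ratio is than the parent's."""
    if parent_ratio <= 0 or modified_ratio <= 0:
        raise ValueError("both ratios must be positive for a fold change")
    return parent_ratio / modified_ratio


def percent_decrease(parent_ratio: float, modified_ratio: float) -> float:
    """Decrease of the NHR/HR ratio as a percentage of the parent ratio."""
    if parent_ratio <= 0:
        raise ValueError("parent ratio must be positive")
    if modified_ratio < 0:
        raise ValueError("modified ratio cannot be negative")
    return 100.0 * (parent_ratio - modified_ratio) / parent_ratio


# Hypothetical example: parent ratio 20, modified ratio 2.
example_fold = fold_decrease(20.0, 2.0)        # 10-fold decrease
example_percent = percent_decrease(20.0, 2.0)  # 90 % decrease
```

On this reading, a 2-fold decrease corresponds to 50%, a 4-fold decrease to 75% and a 10-fold decrease to 90%, while a decrease of 100% would mean that no NHR events are detected at all, in which case the fold change is undefined.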

According to one embodiment, the ratio of NHR/HR is decreased by increasing the expression level of an HR component. HR components are well-known to the person skilled in the art. HR components are herein defined as all genes and elements being involved in the control of the targeted integration of polynucleotides into the genome of a host, said polynucleotides having a certain homology with a certain pre-determined site of the genome of a host wherein the integration is targeted.

The ratio of NHR/HR may be decreased by decreasing the expression level of an NHR component. NHR components are herein defined as all genes and elements being involved in the control of the integration of polynucleotides into the genome of a host, irrespective of the degree of homology of said polynucleotides with the genome sequence of the host. NHR components are well-known to the person skilled in the art. Preferred NHR components are a component selected from the group consisting of the homolog or ortholog for the host cell of the vector-host system according to the invention of the yeast genes involved in the NHR pathway: KU70, KU80, RAD50, MRE11, XRS2, LIG4, LIF1, NEJ1 and SIR4 (van den Bosch et al., 2002, Biol. Chem. 383: 873-892 and Allen et al., 2003, Mol. Cancer Res. 1:913-920). Most preferred is one of KU70, KU80 and LIG4, or both KU70 and KU80. The decrease in expression level of the NHR component can be achieved using the methods well known to those skilled in the art.

Since it is possible that decreasing the expression of components involved in NHR may result in adverse phenotypic effects, it is preferred that in the host cell of the vector-host system according to the invention, the increase in efficiency in homologous recombination is inducible. This can be achieved by methods known to the person skilled in the art, for example by using an inducible process for an NHR component (e.g. by placing the NHR component behind an inducible promoter), by using a transient disruption of the NHR component, or by placing the gene encoding the NHR component back into the genome.

Preferably, when the host cell used in the methods according to the invention is a filamentous fungal host cell, the host cell which has been modified in its genome such that it results in a deficiency in the production of at least one non-ribosomal peptide synthase, preferably a non-ribosomal peptide synthase according to the invention, more preferably a non-ribosomal peptide synthase npsE (see International patent application no. WO2012/001169) additionally comprises one or more modifications in its genome in a polynucleotide encoding a product selected from the group of glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), a toxin, preferably ochratoxin and/or fumonisin, and protease transcriptional regulator prtT such that the host cell is deficient in at least one product encoded by the polynucleotide comprising the modification.

Therefore the fungal host cell additionally comprises modifications in its genome such that it is deficient in at least one of glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), a toxin, such as ochratoxin and fumonisin, preferably ochratoxin and/or fumonisin, more preferably ochratoxin A and/or fumonisin B2, and protease transcriptional regulator prtT. Preferably, the host cell additionally comprises one or more modifications in its genome in a polynucleotide encoding the major extracellular aspartic protease PepA such that the host cell is deficient in the major aspartic protease PepA. For example the host cell according to the invention may further comprise a disruption of the pepA gene encoding the major extracellular aspartic protease PepA. Preferably the host cell according to the invention additionally comprises one or more modifications in its genome in a polynucleotide encoding the hdfA gene such that the host cell is deficient in hdfA. For example the host cell according to the invention may further comprise a disruption of the hdfA gene or other genes involved in the process of NHEJ (as also described in WO06/040312).

Preferably the host cell additionally may comprise at least two substantially homologous DNA domains suitable for integration of one or more copies of a polynucleotide encoding a compound of interest wherein at least one of the at least two substantially homologous DNA domains is adapted to have enhanced integration preference for the polynucleotide encoding a compound of interest compared to the substantially homologous DNA domain it originates from, and wherein the substantially homologous DNA domain where the adapted substantially homologous DNA domain originates from has a gene conversion frequency that is at least 10% higher than one of the other of the at least two substantially homologous DNA domains. These cells have been described in WO2011/009700. Strains containing two or more copies of these substantially homologous DNA domains are hereafter also referred to as strains containing two or more amplicons. Examples of host cells comprising such amplicons are described, e.g., in van Dijck et al, 2003, Regulatory Toxicology and Pharmacology 28; 27-35: On the safety of a new generation of DSM Aspergillus niger enzyme production strains. In van Dijck et al, an Aspergillus niger strain is described that comprises 7 amplified glucoamylase gene loci, i.e. 7 amplicons. In this context, preferred host cells which may contain two or more amplicons belong to a species of an Acremonium, Agaricus, Aspergillus, Aureobasidium, Chrysosporium, Coprinus, Cryptococcus, Filibasidium, Fusarium, Geosmithia, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Phanerochaete, Pleurotus, Rasamsonia, Schizophyllum, Talaromyces, Thermoascus, Thermomyces, Thielavia, Tolypocladium or Trichoderma genus.

Preferred host cells within this context are filamentous fungus host cells, preferably A. niger host cells, comprising two or more amplicons, preferably two or more ΔglaA amplicons (preferably comprising 3, 4, 5, 6 or 7 ΔglaA amplicons) wherein the amplicon which has the highest frequency of gene conversion has been adapted to have enhanced integration preference for the polynucleotide encoding a compound of interest compared to the amplicon it originates from. Adaptation of the amplicon can be performed according to any one of the methods described in WO2011/009700 (which is herein fully incorporated by reference). An example of these host cells, described in WO2011/009700, are host cells comprising three ΔglaA amplicons being a BamHI truncated amplicon, a SalI truncated amplicon and a BglII truncated amplicon and wherein the BamHI amplicon has been adapted to have enhanced integration preference for a polynucleotide encoding a compound of interest compared to the BamHI amplicon it originates from. Host cells comprising two or more amplicons wherein one amplicon has been adapted to have enhanced integration preference for a polynucleotide encoding a compound of interest compared to the amplicon it originates from are hereafter referred to as host cells comprising an adapted amplicon.

Preferably, the host cell according to the invention additionally comprises a modification of SEC61. A preferred SEC61 modification is a modification which results in a one-way mutant of SEC61; i.e. a mutant wherein the de novo synthesized protein can enter the ER via SEC61, but the protein cannot leave the ER via SEC61. Such modifications are extensively described in WO2005/123763. Most preferably, the SEC61 modification is the S376W mutation in which Serine 376 is replaced by Tryptophan.
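The substitution notation used above (original residue, 1-based position, replacement residue) can be illustrated with the following minimal sketch. The helper name is hypothetical and the sequence is an invented placeholder, not the actual SEC61 sequence; the sketch merely shows how a designation such as S376W is read.

```python
def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution written as <from><position><to>, e.g. 'S376W'.

    Positions are 1-based, as is conventional for protein sequences.
    """
    original, replacement = mutation[0], mutation[-1]
    position = int(mutation[1:-1])
    index = position - 1  # convert to 0-based string index
    if not 0 <= index < len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Placeholder sequence (NOT the real SEC61 protein): serine at position 376.
placeholder = "A" * 375 + "S" + "A" * 10
mutant = apply_substitution(placeholder, "S376W")
```

The check on the original residue guards against applying the designation to a sequence with different numbering, for example a homologue from another species.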

A preferred filamentous fungal host cell used in the method according to the invention, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII) and oxalic acid hydrolase (oahA). Another preferred host cell, deficient in a non-ribosomal peptide synthase, preferably a non-ribosomal peptide synthase as defined above additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA) and hdfA. Another preferred host cell, deficient in a non-ribosomal peptide synthase, preferably a non-ribosomal peptide synthase as defined above additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), a toxin, such as ochratoxin and/or fumonisin and hdfA. Preferably, these host cells are also deficient in prtT. Therefore another preferred host cell, deficient in a non-ribosomal peptide synthase, preferably a non-ribosomal peptide synthase as defined above, additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), a toxin, such as ochratoxin and/or fumonisin, prtT and hdfA.

Another preferred host cell, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), ochratoxin, fumonisin, prtT, hdfA and comprises a SEC61 modification being a S376W mutation in which Serine 376 is replaced by Tryptophan.

Preferably these host cells are filamentous fungal cells, more preferably A. niger host cells comprising an adapted amplicon as defined above. Therefore the host cells used in the method according to the invention, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) are filamentous fungus host cells, preferably A. niger host cells additionally deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII) and oxalic acid hydrolase (oahA) and comprising an adapted amplicon as defined above. Another preferred filamentous fungus host cell such as an A. niger host cell, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA) and hdfA and comprises an adapted amplicon as defined above. Another preferred filamentous fungus host cell such as an A. niger host cell, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), one or more toxins, preferably ochratoxin and/or fumonisin and hdfA and comprises an adapted amplicon as defined above. Another preferred filamentous fungus host cell such as an A. niger host cell, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), one or more toxins, preferably ochratoxin and/or fumonisin, prtT and hdfA and comprises an adapted amplicon as defined above.

Another preferred filamentous fungus host cell such as an A. niger host cell, deficient in a non-ribosomal peptide synthase, preferably deficient in a non-ribosomal peptide synthase according to the invention, more preferably in a non-ribosomal peptide synthase npsE (see WO2012/001169) additionally is deficient in pepA, glucoamylase (glaA), acid stable alpha-amylase (amyA), neutral alpha-amylase (amyBI and amyBII), oxalic acid hydrolase (oahA), one or more toxins, preferably ochratoxin and/or fumonisin, prtT and hdfA, comprises a SEC61 modification being a S376W mutation in which Serine 376 is replaced by Tryptophan, and comprises an adapted amplicon as defined above.

These and other possible host modifications are also described in WO2012/001169, WO2011/009700, WO2007/062936, WO2006/040312 or WO2004/070022.

Typically, in the invention, the host cell will be one which produces a compound of interest. The host cell may be capable of producing the compound of interest prior to application of the method of the invention. In this case, the method of the invention may be used to modify the target locus so that production of the compound of interest by the host cell is altered, for example production may be increased. Alternatively, the host cell may be one which produces the compound of interest as a result of application of the method of the invention.

Accordingly, the host cell preferably comprises a recombinant polynucleotide construct comprising a polynucleotide encoding a compound involved in the synthesis of a compound of interest. The polynucleotide may also directly encode a compound of interest. The recombinant polynucleotide construct encoding a compound of interest or a polypeptide involved in the synthesis of a biological compound of interest may be located on an extrachromosomal vector or at a locus in the genome of the host cell.

A host cell of the invention may be capable of producing a desired compound, such as an enzyme, which optionally may be encoded by a recombinant nucleic acid introduced into the cell.

Typically, such a host cell may harbour one or more genes capable of expressing a cellulase, hemicellulase and/or pectinase. The one or more nucleic acid sequences capable of expressing a cellulase, hemicellulase and/or pectinase may include a cellobiohydrolase, endoglucanase and/or beta-glucosidase gene. A suitable cellobiohydrolase is cellobiohydrolase I and/or cellobiohydrolase II.

The compound of interest may be a primary metabolite, secondary metabolite, a biopolymer such as a peptide or polypeptide or it may include biomass comprising the host cell itself. The compound may be encoded by a single polynucleotide or a series of polynucleotides composing a biosynthetic or metabolic pathway or may be the direct product of a single polynucleotide or may be products of a series of polynucleotides. The biological compound may be native to the host cell or heterologous. The biological compound may be modified according to WO2010/102982.

The term “heterologous biological compound” is defined herein as a biological compound which is not native to the cell; or a native biological compound in which structural modifications have been made to alter the native biological compound.

The term “metabolite” encompasses both primary and secondary metabolites; the metabolite may be any metabolite. Preferred metabolites are citric acid, gluconic acid and succinic acid, antibiotics, bioactive drugs, biofuels and building blocks of biomaterials.

The metabolite may be encoded by one or more genes, such as in a biosynthetic or metabolic pathway. Primary metabolites are products of primary or general metabolism of a cell, which are concerned with energy metabolism, growth, and structure. Secondary metabolites are products of secondary metabolism (see, for example, R. B. Herbert, The Biosynthesis of Secondary Metabolites, Chapman and Hall, New York, 1981).

The primary metabolite may be, but is not limited to, an amino acid, carboxylic acid, fatty acid, nucleoside, nucleotide, sugar, triglyceride, or vitamin.

The term “biopolymer” is defined herein as a chain (or polymer) of identical, similar, or dissimilar subunits (monomers). The biopolymer may be any biopolymer. The biopolymer may for example be, but is not limited to, a nucleic acid, polyamine, polyol, polypeptide (or polyamide), or polysaccharide.

The biopolymer may be a polypeptide. The polypeptide may be any polypeptide having a biological activity of interest. The term “polypeptide” is not meant herein to refer to a specific length of the encoded product and, therefore, encompasses peptides, oligopeptides, and proteins. Polypeptides further include naturally occurring allelic and engineered variations of the above-mentioned polypeptides and hybrid polypeptides. The polypeptide may be a modified polypeptide according to WO2010/102982.

The polynucleotide of interest according to the invention may encode an enzyme involved in the synthesis of a primary or secondary metabolite, such as organic acids, carotenoids, antibiotics, anti-cancer drugs, pigments, isoprenoids, alcohols, fatty acids and vitamins. Such a metabolite may be considered as a biological compound according to the present invention.

The compound of interest may be an organic compound selected from glucaric acid, gluconic acid, glutaric acid, adipic acid, succinic acid, tartaric acid, oxalic acid, acetic acid, lactic acid, formic acid, malic acid, maleic acid, malonic acid, citric acid, fumaric acid, itaconic acid, levulinic acid, xylonic acid, aconitic acid, ascorbic acid, kojic acid, coumaric acid, an amino acid, a polyunsaturated fatty acid, ethanol, 1,3-propanediol, ethylene, glycerol, xylitol, carotene, astaxanthin, lycopene and lutein.

Alternatively, the compound of interest may be a β-lactam antibiotic such as Penicillin G or Penicillin V and fermentative derivatives thereof, a cephalosporin, cyclosporin or lovastatin. The secondary metabolite may be an antibiotic, antifeedant, attractant, bactericide, fungicide, hormone, insecticide, or rodenticide. Preferred antibiotics are cephalosporins and beta-lactams.

The biopolymer may be a polysaccharide. The polysaccharide may be any polysaccharide, including, but not limited to, a mucopolysaccharide (e.g., heparin and hyaluronic acid) and a nitrogen-containing polysaccharide (e.g., chitin). In a more preferred option, the polysaccharide is hyaluronic acid.

The compound of interest may be a peptide selected from an oligopeptide, a polypeptide, a (pharmaceutical or industrial) protein and an enzyme. In such processes the peptide is preferably secreted from the host cell, more preferably secreted into the culture medium such that the peptide may easily be recovered by separation of the host cellular biomass and culture medium comprising the peptide, e.g. by centrifugation or (ultra)filtration.

The polypeptide may be native or may be heterologous to the host cell. The polypeptide may be a collagen or gelatin, or a variant or hybrid thereof. The polypeptide may be an antibody or parts thereof, an antigen, a clotting factor, an enzyme, a hormone or a hormone variant, a receptor or parts thereof, a regulatory protein, a structural protein, a reporter, a transport protein, a protein involved in a secretion process, a protein involved in a folding process, a chaperone, a peptide amino acid transporter, a glycosylation factor, a transcription factor, a synthetic peptide or oligopeptide, or an intracellular protein. The intracellular protein may be an enzyme such as a protease, ceramidase, epoxide hydrolase, aminopeptidase, acylase, aldolase, hydroxylase, lipase, non-ribosomal synthetase or polyketide synthetase. The polypeptide may be an enzyme secreted extracellularly.

Examples of proteins or (poly)peptides with industrial applications that may be produced in the methods of the invention include enzymes such as lipases (e.g. used in the detergent industry), proteases (used inter alia in the detergent industry, in brewing and the like, such as carboxypeptidases, endo-proteases, metallo-proteases, serine-proteases), carbohydrases and cell wall degrading enzymes (such as amylases, glucosidases, cellulases (such as endoglucanases, β-glucanases, cellobiohydrolases, GH61 enzymes or β-glucosidases), GH61-enzymes, hemicellulases or pectinolytic enzymes, beta-1,3/4- and beta-1,6-glucanases, rhamnogalacturonases, mannanases, xylanases, pullulanases, galactanases, esterases and the like, used in fruit processing, wine making and the like or in feed), phytases, phospholipases, asparaginases, glycosidases (such as amylases, beta-glucosidases, arabinofuranosidases, rhamnosidases, apiosidases and the like), dairy enzymes and products (e.g. chymosin, casein), oxidoreductases such as oxidases, transferases, or isomerases or polypeptides (e.g. poly-lysine and the like, cyanophycin and its derivatives).

Mammalian, and preferably human, polypeptides with therapeutic, cosmetic or diagnostic applications include, but are not limited to, collagen and gelatin, insulin, serum albumin (HSA), lactoferrin and immunoglobulins, including fragments thereof. The polypeptide may be an antibody or a part thereof, an antigen, a clotting factor, an enzyme, a hormone or a hormone variant, a receptor or parts thereof, a regulatory protein, a structural protein, a reporter, or a transport protein, protein involved in secretion process, protein involved in folding process, chaperone, peptide amino acid transporter, glycosylation factor, transcription factor, synthetic peptide or oligopeptide, intracellular protein. The intracellular protein may be an enzyme such as, a protease, ceramidases, epoxide hydrolase, aminopeptidase, acylases, aldolase, hydroxylase, aminopeptidase, lipase.

According to the present invention, a polypeptide can also be a fused or hybrid polypeptide to which another polypeptide is fused at the N-terminus or the C-terminus of the polypeptide or fragment thereof. A fused polypeptide is produced by fusing a nucleic acid sequence (or a portion thereof) encoding one polypeptide to a nucleic acid sequence (or a portion thereof) encoding another polypeptide.

Techniques for producing fusion polypeptides are known in the art, and include ligating the coding sequences encoding the polypeptides so that they are in frame and expression of the fused polypeptide is under control of the same promoter(s) and terminator. The hybrid polypeptides may comprise a combination of partial or complete polypeptide sequences obtained from at least two different polypeptides wherein one or more may be heterologous to the host cell.

The compound of interest may also be the product of a selectable marker. A selectable marker is a product of a polynucleotide of interest which product provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Selectable markers include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), trpC (anthranilate synthase), ble (phleomycin resistance protein), as well as equivalents thereof.

When the compound of interest is a biopolymer as defined earlier herein, the host cell may already be capable of producing the biopolymer. The host cell may also be provided with a recombinant homologous or heterologous polynucleotide construct that encodes a polypeptide involved in the production of the biological compound of interest. The person skilled in the art knows how to modify a microbial host cell such that it is capable of production of the compound involved in the production of the biological compound of interest.

The term “recombinant polynucleotide” herein refers to a nucleic acid molecule, either single- or double-stranded, which has been introduced into a Rasamsonia cell, for example a nucleic acid which is present in the cell in a form or at a locus in which it would not normally be present (in relation to a corresponding cell not comprising the recombinant polynucleotide).

The term “recombinant polynucleotide construct” is herein referred to as a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which has been modified to contain segments of nucleic acid which are combined and juxtaposed in a manner which would not otherwise exist in nature. The term recombinant polynucleotide construct is synonymous with the term “expression cassette” when the nucleic acid construct contains all the control sequences required for expression of a coding sequence, wherein said control sequences are operably linked to said coding sequence. Suitable control sequences are described herein.

A host cell of the invention may comprise one or more recombinant polynucleotides or recombinant polynucleotide constructs in order that a compound of interest may be produced.

In order to facilitate expression, the polynucleotide encoding the polypeptide involved in the production of the compound of interest may be a synthetic polynucleotide. The synthetic polynucleotides may be optimized in codon use, preferably according to the methods described in WO2006/077258 or WO2008/000632. WO2008/000632 addresses codon-pair optimization. Codon-pair optimization is a method wherein the nucleotide sequences encoding a polypeptide have been modified with respect to their codon-usage, in particular the codon-pairs that are used, to obtain improved expression of the nucleotide sequence encoding the polypeptide and/or improved production of the encoded polypeptide. Codon pairs are defined as a set of two subsequent triplets (codons) in a coding sequence (CDS).
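The definition of a codon pair given above can be illustrated with a minimal sketch. The function name `codon_pairs` and the short example CDS are hypothetical and serve only to show how adjacent triplets are enumerated; the sketch does not reproduce the optimization methods of WO2006/077258 or WO2008/000632.

```python
from collections import Counter


def codon_pairs(cds):
    """Return the adjacent codon pairs (two subsequent triplets) of a CDS."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Pair every codon with the codon that directly follows it.
    return list(zip(codons, codons[1:]))


# Hypothetical five-codon coding sequence: ATG GCT GCT AAA TAA
cds = "ATGGCTGCTAAATAA"
pairs = codon_pairs(cds)
pair_counts = Counter(pairs)  # frequency of each codon pair in the CDS
print(pairs)
```

A CDS of n codons contains n-1 overlapping codon pairs, so the count table above is the kind of raw input a codon-pair usage analysis might start from.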

Furthermore, standard molecular cloning techniques such as DNA isolation, gel electrophoresis, enzymatic restriction modifications of nucleic acids, Southern analyses, transformation of cells, etc., are known to the skilled person and are for example described by Sambrook et al. (1989) “Molecular Cloning: a laboratory manual”, Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y. and Innis et al. (1990) “PCR protocols, a guide to methods and applications” Academic Press, San Diego.

A nucleic acid may be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vehicle and characterized by DNA sequence analysis.

A host cell (transformant) according to the invention may be cultured using procedures known in the art. For each combination of a promoter and a host cell, culture conditions are available which are conducive to the expression of the DNA sequence encoding the polypeptide. After reaching the desired cell density or titre of the polypeptide, the culture is stopped and the polypeptide is recovered using known procedures.

The fermentation medium can comprise a culture medium containing a carbon source (e.g. glucose, maltose, molasses, starch, cellulose, xylan, pectin, lignocellulosic biomass hydrolysate, etc.), a nitrogen source (e.g. ammonium sulphate, ammonium nitrate, ammonium chloride, etc.), an organic nitrogen source (e.g. yeast extract, malt extract, peptone, etc.) and inorganic nutrient sources (e.g. phosphate, magnesium, potassium, zinc, iron, etc.). Optionally, an inducer (e.g. cellulose, pectin, xylan, maltose, maltodextrin or xylogalacturonan) may be included.

The selection of the appropriate medium may be based on the choice of expression host and/or based on the regulatory requirements of the expression construct. Such media are known to those skilled in the art. The medium may, if desired, contain additional components favouring the transformed expression hosts over other potentially contaminating microorganisms.

The fermentation can be performed over a period of from about 0.5 to about 30 days. It may be a batch, fed-batch, or continuous process, suitably at a temperature in the range of, for example, from about 20 to about 90° C., preferably 20-55° C., more preferably 40-50° C., and/or at a pH, for example, from about 2 to about 8, preferably from about 3 to about 5. The appropriate conditions are usually selected based on the choice of the expression host and the polypeptide to be expressed.

After fermentation, if necessary, the cells can be removed from the fermentation broth by means of centrifugation or filtration. After fermentation has stopped or after removal of the cells, the polypeptide of the invention may then be recovered and, if desired, purified and isolated by conventional means.

For the purpose of this invention, it is defined here that in order to determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the complete sequences are aligned for optimal comparison purposes. In order to optimize the alignment between the two sequences gaps may be introduced in any of the two sequences that are compared. Such alignment is carried out over the full length of the sequences being compared. The identity is the percentage of identical matches between the two sequences over the reported aligned region.
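As a minimal sketch of the identity definition above, the following function counts identical matches over the full aligned region of two already-aligned sequences. The function name, the gap symbol `-` and the example alignment are assumptions chosen for illustration; dividing by the full alignment length (gap columns included) is one reading of "over the reported aligned region".

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Hypothetical alignment of six columns with one mismatch and one gap.
identity = percent_identity("ACGT-A", "ACCTGA")
print(round(identity, 2))
```

In this example four of the six columns are identical matches, so gaps and mismatches both lower the reported percentage.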

A comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. The skilled person will be aware of the fact that several different computer programs are available to align two sequences and determine the homology between two sequences (Kruskal, J. B. (1983). An overview of sequence comparison. In D. Sankoff and J. B. Kruskal, (ed.), Time warps, string edits and macromolecules: the theory and practice of sequence comparison, pp. 1-44, Addison Wesley). The percent identity between two amino acid sequences can be determined using the Needleman and Wunsch algorithm for the alignment of two sequences (Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). The algorithm aligns amino acid sequences as well as nucleotide sequences. The Needleman-Wunsch algorithm has been implemented in the computer program NEEDLE. For the purpose of this invention the NEEDLE program from the EMBOSS package was used (version 2.8.0 or higher, EMBOSS: The European Molecular Biology Open Software Suite (2000) Rice, P., Longden, I. and Bleasby, A. Trends in Genetics 16, (6) pp. 276-277, emboss.bioinformatics.nl). For protein sequences, EBLOSUM62 is used for the substitution matrix. For nucleotide sequences, EDNAFULL is used. Other matrices can be specified. For the purpose of the invention, the parameters used for alignment of amino acid sequences are a gap-open penalty of 10 and a gap extension penalty of 0.5. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
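The global alignment described above can be sketched as follows. This is an illustrative re-implementation of Needleman-Wunsch alignment with affine gaps, not the NEEDLE program itself. It uses the gap-open penalty of 10 and gap-extension penalty of 0.5 stated above, but replaces the EBLOSUM62 and EDNAFULL matrices with a simple match/mismatch score (5 and -4, an assumption) and penalizes end gaps, which may differ from NEEDLE's behaviour.

```python
NEG_INF = float("-inf")


def needleman_wunsch_affine(a, b, match=5.0, mismatch=-4.0,
                            gap_open=10.0, gap_extend=0.5):
    """Global alignment with affine gaps; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open, X[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend)
    # Traceback from the best-scoring state in the bottom-right cell.
    tables = {"M": M, "X": X, "Y": Y}
    state = max("MXY", key=lambda k: tables[k][n][m])
    score = tables[state][n][m]
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if state == "M":
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
            prev = {"M": M[i][j], "X": X[i][j], "Y": Y[i][j]}
        elif state == "X":
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
            prev = {"M": M[i][j] - gap_open, "X": X[i][j] - gap_extend,
                    "Y": Y[i][j] - gap_open}
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
            prev = {"M": M[i][j] - gap_open, "X": X[i][j] - gap_open,
                    "Y": Y[i][j] - gap_extend}
        state = max(prev, key=prev.get)
    return score, "".join(reversed(out_a)), "".join(reversed(out_b))


def identity_percent(x, y):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


# Hypothetical nucleotide sequences differing by a single deleted base.
score, aln_a, aln_b = needleman_wunsch_affine("ACGTACGT", "ACGTCGT")
print(aln_a, aln_b, score, identity_percent(aln_a, aln_b))
```

For real comparisons the NEEDLE program with the matrices named above would be used; the sketch only shows how the gap-open and gap-extension penalties enter the recurrence and how identity is read off the resulting alignment.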

The protein sequences mentioned herein can further be used as a “query sequence” to perform a search against sequence databases, for example to identify other family members or related sequences. Such searches can be performed using the BLAST programs. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov). BLASTP is used for amino acid sequences and BLASTN for nucleotide sequences. In the BLAST program, the default settings may be used:

- Cost to open gap: default=5 for nucleotides/11 for proteins
- Cost to extend gap: default=2 for nucleotides/1 for proteins
- Penalty for nucleotide mismatch: default=−3
- Reward for nucleotide match: default=1
- Expect value: default=10
- Wordsize: default=11 for nucleotides/28 for megablast/3 for proteins

The nucleic acid sequences as mentioned herein can further be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, word-length=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention.

The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The specific sequences disclosed herein can be readily used to isolate the complete gene from filamentous fungi, in particular A. niger which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors.

Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA molecule herein were determined using an automated DNA sequencer and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of a nucleic acid sequence determined as above. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
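The frame-shift effect described above can be made concrete with a small sketch. The two DNA strings are hypothetical: `actual` stands for the true sequence and `determined` for a sequencing read containing one erroneous inserted base. The codon table is the standard genetic code built in TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


actual = "ATGGCTAAAGGTCTGTAA"                # hypothetical true sequence
determined = actual[:6] + "C" + actual[6:]   # one spurious base after codon 2

protein_actual = translate(actual)
protein_determined = translate(determined)
print(protein_actual, protein_determined)
```

The two predicted proteins agree up to the point of insertion and diverge completely afterwards, which is the behaviour the paragraph above describes.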

The person skilled in the art is capable of identifying such erroneously identified bases and knows how to correct for such errors.

A reference herein to a patent document or other matter which is given as prior art is not to be taken as an admission that that document or matter was known or that the information it contains was part of the common general knowledge as at the priority date of any of the claims.

The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.

The present invention is further illustrated by the following Examples:

It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

Strains

WT 1: This Aspergillus niger strain is used as a wild-type strain. This strain is deposited at the CBS Institute under the deposit number CBS 513.88.

GBA302: The strain Aspergillus niger GBA 302 (ΔglaA, ΔpepA, ΔhdfA) is used herein as recipient strain in transformations. Construction of GBA 302 is described in WO2011009700.

The Rasamsonia emersonii (R. emersonii) strains used herein are derived from ATCC16479, which is used as wild-type strain. ATCC16479 was formerly also known as Talaromyces emersonii and Penicillium geosmithia emersonii. Where the name Rasamsonia emersonii is used, Talaromyces emersonii is also meant. Other strain designations of R. emersonii ATCC16479 are CBS393.64, IFO31232 and IMI116815.

Rasamsonia (Talaromyces) emersonii strain TEC-142 is deposited at CENTRAAL BUREAU VOOR SCHIMMELCULTURES, Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands on 1 Jul. 2009 having the Accession Number CBS 124902. TEC-142S is a single isolate of TEC-142.

Molecular Biology Techniques

In these strains, using molecular biology techniques known to the skilled person (see: Sambrook & Russell, Molecular Cloning: A Laboratory Manual, 3rd Ed., CSHL Press, Cold Spring Harbor, N.Y., 2001), several genes were over expressed and others were down regulated as described below. Examples of the general design of expression vectors for gene over expression and disruption vectors for down-regulation, transformation, use of markers and selective media can be found in for example WO199846772, WO199932617, WO2001121779, WO2005095624, EP 635574B and WO2005100573.

Media and Solutions

Potato Dextrose Agar, PDA, (Fluka, Cat. No. 70139)

Potato extract: 4 g/l
Dextrose: 20 g/l
Bacto agar: 15 g/l
pH: 5.4
Water: adjust to one liter
Sterilize 20 min at 120° C.

Minimal Medium Agar Plates

8.8 g glucose, 6.6 g agarose, H₂O to 400 ml. Autoclave 20 minutes at 115° C. and cool to 55° C. Add Solution I, mix and pour plates.

Solution I

11 ml stock A, 11 ml stock B, 0.44 ml stock trace elements (1000×), 4.4 ml Penicillin/Streptomycin Solution, 13.2 ml H₂O.

Stock A

120 g NaNO₃, 10.4 g KCl, 30.4 g KH₂PO₄, 22.5 ml 4M KOH, H₂O to 500 ml. Autoclave 20 minutes at 120° C.

Stock B (40×)

10.4 g MgSO₄.7H₂O, H₂O to 500 ml. Autoclave 20 minutes at 120° C.

Stock Trace Elements (1000×)

2.2 g ZnSO₄.7H₂O, 1.1 g H₃BO₃, 0.5 g FeSO₄.7H₂O, 0.17 g CoCl₂.6H₂O, 0.16 g CuSO₄.5H₂O, 0.5 g MnCl₂.4H₂O, 0.15 g Na₂MoO₄.2H₂O, 5.0 g EDTA.

Dissolve EDTA and ZnSO₄.7H₂O in 75 ml of milliQ water and set the pH to 6.0 with 1 M NaOH. Whilst maintaining the pH at 6.0, dissolve the remaining components one by one. When ready, set the pH to 4.0 with 1 M HCl and adjust the volume to 100 ml with milliQ water. Autoclave 20 minutes at 120° C.
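The volumes given for the minimal medium agar plates can be cross-checked with a short calculation. The sketch below assumes that volumes are additive and that the agar base is exactly 400 ml; variable names are illustrative only. It shows that the recipe corresponds to roughly 40-fold dilution of stocks A and B, 1000-fold dilution of the trace elements, and about 20 g/l glucose in the poured plates.

```python
# Volumes (ml) taken from the recipe above; additivity of volumes is assumed.
base_ml = 400.0
solution_i_ml = {
    "stock A": 11.0,
    "stock B": 11.0,
    "trace elements": 0.44,
    "penicillin/streptomycin": 4.4,
    "water": 13.2,
}

total_ml = base_ml + sum(solution_i_ml.values())

# Fold dilution of each Solution I component in the final medium.
dilution = {name: total_ml / vol for name, vol in solution_i_ml.items()}

# Final concentrations (g/l) of selected ingredients.
glucose_g_per_l = 8.8 / (total_ml / 1000.0)
agarose_g_per_l = 6.6 / (total_ml / 1000.0)
nano3_stock_g_per_l = 120.0 / 0.5            # 120 g NaNO3 in 500 ml stock A
nano3_final_g_per_l = nano3_stock_g_per_l / dilution["stock A"]

print(round(total_ml, 2), round(dilution["stock A"], 1),
      round(glucose_g_per_l, 1), round(nano3_final_g_per_l, 2))
```

The 40-fold figure agrees with the "(40×)" label on stock B and the 1000-fold figure with the "(1000×)" label on the trace element stock.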

Rasamsonia Agar Medium

Salt fraction no. 3: 15 g
Cellulose (3%): 30 g
BACTO™ peptone: 7.5 g
Grain flour: 15 g
KH₂PO₄: 5 g
CaCl₂•2aq: 1 g
Bacto agar: 20 g
pH: 6.0
Water: adjust to one liter
Sterilize 20 min at 120° C.

Salt Fraction Composition

The “salt fraction no. 3” corresponded to the disclosure of WO98/37179, Table 1. Deviations from the composition of this table were CaCl₂.2 aq 1.0 g/l, KCl 1.8 g/l, citric acid 1 aq 0.45 g/l (chelating agent).

Shake Flask Media for Rasamsonia

Rasamsonia Medium 1

Glucose: 20 g/L
Yeast extract (Difco): 20 g/L
Clerol FBA3107 (AF): 4 drops/L
pH: 6.0
Sterilize 20 min at 120° C.

Rasamsonia Medium 2

Salt fraction no. 3: 15 g
Cellulose: 20 g
BACTO™ peptone: 4 g
Grain flour: 7.5 g
KH₂PO₄: 10 g
CaCl₂•2H₂O: 0.5 g
Clerol FBA3107 (AF): 0.4 ml
pH: 5
Water: adjust to one liter
Sterilize 20 min at 120° C.

Spore Batch Preparation for Rasamsonia

Strains were grown from stocks on Rasamsonia agar medium in 10 cm diameter Petri dishes for 5-7 days at 40° C. For MTP fermentations, strains were grown in 96-well plates containing Rasamsonia agar medium. Strain stocks were stored at −80° C. in 10% glycerol.

Chromosomal DNA Isolation

Strains were grown in YGG medium (per liter: 8 g KCl, 16 g glucose.H₂O, 20 ml of 10% yeast extract, 10 ml of 100× pen/strep, 6.66 g YNB+amino acids, 1.5 g citric acid, and 6 g K₂HPO₄) for 16 hours at 42° C., 250 rpm, and chromosomal DNA was isolated using the DNEASY® plant mini kit (Qiagen, Hilden, Germany).

MTP Fermentation of Rasamsonia

96-well microtiter plates (MTP) with sporulated Rasamsonia strains were used to harvest spores for MTP fermentations. To do this, 200 μl of Rasamsonia medium 1 was added to each well and, after resuspending the mixture, 100 μl of spore suspension was incubated in humidity shakers (Infors) at 44° C., 550 rpm and 80% humidity for 16 hours. Subsequently, 50 μl of pre-culture was used to inoculate 250 μl of Rasamsonia medium 2 in MTP plates. The 96-well plates were incubated in humidity shakers (Infors) at 44° C., 550 rpm and 80% humidity for 6 days. Plates were centrifuged and supernatants were harvested.

Protein Analysis

Protein samples were separated under reducing conditions on NuPAGE® 4-12% Bis-Tris gel (Invitrogen, Breda, The Netherlands) and stained as indicated. Gels were stained with either INSTANTBLUE™ (Expedeon, Cambridge, United Kingdom), SIMPLYBLUE™ safestain (Invitrogen, Breda, The Netherlands) or SYPRO® Ruby (Invitrogen, Breda, The Netherlands) according to the manufacturer's instructions.

For Western blotting, proteins were transferred to nitrocellulose. The nitrocellulose filter was blocked with TBST (Tris buffered saline containing 0.1% TWEEN® 40) containing 3% skim-milk and incubated for 16 hours with anti-FLAG M2 antibody (Sigma, Zwijndrecht, The Netherlands). Blots were washed twice with TBST for 10 minutes and stained with Horseradish-peroxidase conjugated rabbit-anti-mouse antibody (DAKO, Glostrup, Denmark) for 1 hour. After washing the blots five times with TBST for 10 minutes, proteins were visualized using SUPERSIGNAL® (Pierce, Rockford, U.S.A).

Enzyme Activity Measurements

Proline Specific Endoprotease Activity

The proteolytic activity of the proline specific endoprotease is measured spectrophotometrically at 410 nm over time using CBZ-Gly(cine)-Pro(line)-pNA at 37° C. in a citrate/disodium phosphate buffer at pH 5. 1 U proline specific endoprotease is defined as the amount of enzyme which converts 1 μmol (micromol) CBZ-Gly(cine)-Pro(line)-pNA per min at pH 5 and 37° C. under the conditions described above.

Cellulase Assay: Wheat Straw Assay (WSU Assay).

Preparation of Pre-Treated, Washed Wheat Straw Substrate

Dilute-acid pre-treated wheat straw was washed with water until the solution with wheat straw was pH 6.5 or higher; the mass was then homogenised using an Ultra-Turrax, lyophilized and ground prior to analysis. To obtain pre-treated wheat straw, a dilute acid pre-treatment as described in Linde, M. et al, Biomass and Bioenergy 32 (2008), 326-332 and equipment as described in Schell, D. J., Applied Biochemistry and Biotechnology (2003), vol. 105-108, pp 69-85, may be used.

Measurement of Cellulase Activity in WSU/ml

1 WSU is defined as 0.119 mg/ml glucose released from 2.1 w/v % washed pre-treated wheat straw by 200 μl of enzyme mix in 20 hours at 65° C. at pH 4.50.

The glucose release is not a linear function of the quantity of enzyme in the composition. In other words, twice the amount of enzyme does not automatically result in twice the amount of glucose in equal time. Therefore, it is preferred to choose the dilution of the composition to be tested for WSU activity such that the measured WSU value does not exceed 40.

400 μl of supernatants harvested from shake flask experiments were diluted 4.5-fold. Diluted sample was used to perform two measurements in which 200 μl of diluted sample was analysed. In the first measurement, 200 μl diluted sample was transferred to a vial containing 700 μl water containing 3% (w/v) dry matter of the pretreated washed wheat straw substrate and 100 μl of 250 mM citrate buffer; the final pH was adjusted to pH 4.5. In the second measurement, the blank sample, 200 μl of diluted sample was transferred to a vial that contained 700 μl of water instead of pretreated washed wheat straw substrate, and 100 μl of 250 mM citrate buffer; the final pH was adjusted to pH 4.5. The assay samples were incubated for 20 hours at 65° C. After incubation of the assay samples, 100 μl of internal standard solution (20 g/L maleic acid, 40 g/L EDTA in D₂O) was added. The amount of glucose released was based on the signal at 5.20 ppm, relative to dimethyl-sila-pentane-sulfonate, determined by means of 1D ¹H NMR operating at a proton frequency of 500 MHz, using a pulse program with water suppression, at a temperature of 27° C. The WSU number was calculated from the data by subtracting the amount of glucose that was detected in the blank sample from the amount of glucose that was measured in the sample incubated with wheat straw.
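The blank subtraction and the WSU definition above can be combined in a short worked calculation. This is a sketch under stated assumptions: the glucose values are hypothetical, they are taken to be expressed in mg/ml of the 1 ml assay mixture, and the WSU number is taken to be the net glucose divided by 0.119 mg/ml. No back-calculation to the undiluted supernatant is attempted, because glucose release is stated to be non-linear in enzyme quantity.

```python
GLUCOSE_PER_WSU_MG_PER_ML = 0.119   # from the definition of 1 WSU
MAX_PREFERRED_WSU = 40.0            # preferred upper limit for a measurement


def substrate_pct_in_assay(stock_pct=3.0, substrate_ul=700.0, total_ul=1000.0):
    """Final w/v % wheat straw when the 3% suspension is diluted into the assay."""
    return stock_pct * substrate_ul / total_ul


def wsu_number(glucose_sample_mg_per_ml, glucose_blank_mg_per_ml):
    """WSU from the sample incubated with wheat straw minus the substrate-free blank."""
    net = glucose_sample_mg_per_ml - glucose_blank_mg_per_ml
    return net / GLUCOSE_PER_WSU_MG_PER_ML


# Hypothetical NMR-derived glucose concentrations (mg/ml).
wsu = wsu_number(glucose_sample_mg_per_ml=2.50, glucose_blank_mg_per_ml=0.12)
needs_further_dilution = wsu > MAX_PREFERRED_WSU

print(round(substrate_pct_in_assay(), 2), round(wsu, 1), needs_further_dilution)
```

The first value confirms that 700 μl of a 3% (w/v) suspension in a 1000 μl assay gives the 2.1 w/v % substrate loading named in the definition of 1 WSU.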

Example 1 Construction and Description of Aspergillus Deletion Vectors

Three genes which are candidates for disruption were identified in the genome sequence of A. niger CBS513.88. All nucleotide sequences for A. niger genes and their genomic context can be derived for example from NCBI (www.ncbi.nlm.nih.gov/) or EMBL (www.ebi.ac.uk/embl). The nicB gene is encoded by ORF An11g10910, the PdxA gene by An03g04280, and the epo gene by An08g04490.

Gene replacement vectors were designed according to known principles and constructed according to routine cloning procedures as also described in EP635574B and WO 98/46772. In essence, these vectors comprise approximately 1-2 kb flanking regions of the respective ORF sequences, to target for homologous recombination at the predestined genomic loci. They may contain for example the A. nidulans bi-directional amdS selection marker, the hygromycin B marker or the phleomycin selection marker for transformation. The method applied for gene replacements in all examples herein uses linear DNA, which integrates into the genome at the homologous locus of the flanking sequences by a double cross-over, thus substituting the gene to be deleted by a marker gene (such as the amdS gene). Loss of the amdS marker for example can be selected for by plating on fluoro-acetamide media.
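The double cross-over replacement described above can be pictured with a toy string model. The function name, the 10-base flanks and all sequences below are hypothetical; the vectors described here use flanks of roughly 1-2 kb, and the sketch models only the outcome of the recombination, not its mechanism or frequency.

```python
def double_crossover(genome, cassette, flank_len):
    """Replace the genomic region between two flanks by the cassette payload."""
    up_flank = cassette[:flank_len]
    down_flank = cassette[-flank_len:]
    payload = cassette[flank_len:-flank_len]
    start = genome.find(up_flank)
    if start == -1:
        return None                      # no homologous upstream locus
    end = genome.find(down_flank, start + flank_len)
    if end == -1:
        return None                      # no homologous downstream locus
    # Keep the upstream flank, swap in the payload, keep everything from the
    # downstream flank onwards.
    return genome[:start + flank_len] + payload + genome[end:]


# Toy sequences (hypothetical).
UP, DOWN = "ACGTTGCAAC", "GGATCCTTAG"
GENE = "ATGAAACCCGGGTAA"      # gene to be deleted
MARKER = "ATGTTTTTTTGA"       # stands in for a marker gene such as amdS

genome = "CCCC" + UP + GENE + DOWN + "AAAA"
cassette = UP + MARKER + DOWN            # linear replacement fragment
edited = double_crossover(genome, cassette, flank_len=len(UP))
print(edited)
```

After the modelled event the gene is absent and the marker sits between the unchanged flanks, which is the genotype selected for on the marker.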

Based on genomic sequences, gene replacement vectors for nicB, PdxA and epo were designed as follows. In essence, the nicB deletion vector pDELNicB-3 comprises an approximately 1000 bp 5′ upstream flanking region (Nic-US) and an approximately 1000 bp 3′ downstream flanking region (Nic-DS) of the nicB ORF to allow targeting for homologous recombination at the predestined genomic nicB locus. In addition, pDELNicB-3 contains the hygromycin B selection marker cassette (from pAN7-1, NCBI gi: 475166), and mutant loxP sites (lox66 and lox71, SEQ ID Nos: 1 and 2 respectively) were placed around the HygB marker as indicated (for general layout of pDELNicB-3 see FIG. 1).

The pDEL_PdxA-2 vector for PdxA deletion was constructed likewise, with a 5′ flanking region (Pdx-US) and a 3′ flanking region (Pdx_DS) of similar length for the PdxA ORF. In contrast to pDEL_NicB-3, the pDEL_PdxA-2 vector comprises the phleomycin selection marker (phleomycin marker as in pAN8-1, NCBI gi: 475899) with mutant loxP sites (lox66 and lox71, SEQ ID No: 1 and 2, respectively) positioned around the marker cassette (for general layout of pDEL_PdxA-2 see FIG. 2). For reference, the double mutant lox72 site is shown in SEQ ID NO: 3.
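How a lox71/lox66 pair flanking a marker leaves a double mutant lox72 site after Cre-mediated excision can be sketched as below. The 34-base site sequences are those commonly reported for loxP, lox71, lox66 and lox72 and are an assumption here; SEQ ID NOs: 1 to 3 are not reproduced, and placing lox71 upstream and lox66 downstream in direct orientation is likewise assumed for illustration.

```python
# Each lox site: 13-base left arm + 8-base spacer + 13-base right arm.
LEFT_WT, SPACER, RIGHT_WT = "ATAACTTCGTATA", "GCATACAT", "TATACGAAGTTAT"
LEFT_MUT, RIGHT_MUT = "TACCGTTCGTATA", "TATACGAACGGTA"   # mutated arms

LOXP = LEFT_WT + SPACER + RIGHT_WT
LOX71 = LEFT_MUT + SPACER + RIGHT_WT    # left arm mutated
LOX66 = LEFT_WT + SPACER + RIGHT_MUT    # right arm mutated
LOX72 = LEFT_MUT + SPACER + RIGHT_MUT   # both arms mutated


def excise(upstream_site, downstream_site):
    """Model Cre excision between two directly repeated lox sites.

    Strand exchange occurs in the spacer, so the site left in the chromosome
    combines the left arm of the upstream site with the right arm of the
    downstream site; the excised circle carries the reciprocal site.
    """
    if upstream_site[13:21] != downstream_site[13:21]:
        raise ValueError("spacers must match for recombination")
    retained = upstream_site[:13] + upstream_site[13:21] + downstream_site[21:]
    on_circle = downstream_site[:13] + downstream_site[13:21] + upstream_site[21:]
    return retained, on_circle


retained_site, circle_site = excise(LOX71, LOX66)
print(retained_site == LOX72, circle_site == LOXP)
```

Under these assumptions the chromosome keeps only the lox72 site, while the marker leaves on a circle carrying a wild-type loxP site.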

Vectors for deletion of the epo gene were designed in a slightly different way, comprising the construction and use of two different vectors. The insert fragments of both vectors together can be applied in the so-called “bipartite gene-targeting” method (Nielsen et al., 2006, 43: 54-64). This method uses two overlapping, non-functional DNA fragments of a selection marker (see also WO2008113847 for further details of the bipartite method) together with gene-targeting sequences. Upon correct homologous recombination the selection marker becomes functional by integration at a homologous target locus. As also detailed in WO 2008113847, two different deletion vectors pDEL_EPO_Hyg-1 and pDEL_EPO_CRE-1 were designed and constructed to be able to provide the two overlapping DNA molecules for bipartite gene-targeting. The first vector pDEL_EPO_Hyg-1 (General layout as in FIG. 3) comprises a first non-functional hygB marker fragment (PgpdA-HygB sequence missing the last 27 bases of the coding sequence at the 3′ end of hygB, SEQ ID NO: 4) and at one side of the hygB cassette a Lox71 sequence site and the 5′-upstream gene flanking region of the epo ORF (EPO-US). The second pDEL_EPO_CRE-1 vector (General layout as in FIG. 4) comprises a second non-functional hygB fragment (HygB-TtrpC sequence missing the first 44 bases of the coding sequence at the 5′ end of hygB, SEQ ID NO: 5) and at one side of the hygB cassette, a cre recombinase cassette, a Lox66 sequence site and the 3′-downstream gene flanking region of the epo ORF (EPO-DS). The cre recombinase cassette contains the A. nidulans xylanase A promoter, a cre recombinase and xylanase A terminator, to allow xylose-inducible expression of the cre recombinase (SEQ ID NO: 6). Upon homologous recombination, the first and second non-functional fragments together produce a functional hygB cassette. Both the epo upstream and downstream gene flanking regions target the bipartite fragments for homologous recombination at the predestined epo genomic locus.
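The reconstitution of a functional marker from two overlapping non-functional fragments can be illustrated with a minimal string-based sketch. It is an illustration only and not part of the described method: the 1000-base toy "marker" and the function names are assumptions, and only the truncation lengths (27 bases at the 3′ end, 44 bases at the 5′ end) are taken from the description above. In the actual constructs these truncations apply to the hygB coding sequence within a larger cassette.

```python
import random


def longest_overlap(left: str, right: str, min_len: int = 1) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def recombine(left: str, right: str, min_len: int = 20) -> str:
    """Join two fragments across their shared overlap.

    Toy model of homologous recombination between the two marker halves;
    `min_len` is an arbitrary illustrative threshold.
    """
    n = longest_overlap(left, right, min_len)
    if n == 0:
        raise ValueError("fragments share no usable overlap")
    return left + right[n:]


# Hypothetical 1000-base sequence standing in for the real marker.
rng = random.Random(0)
marker = "".join(rng.choice("ACGT") for _ in range(1000))

fragment_1 = marker[:-27]  # lacks the last 27 bases (3' end)
fragment_2 = marker[44:]   # lacks the first 44 bases (5' end)

# Neither fragment alone carries the complete marker ...
assert fragment_1 != marker and fragment_2 != marker
# ... but joining them across the overlap restores it.
assert recombine(fragment_1, fragment_2) == marker
```

In this toy case the two fragments share 1000 - 27 - 44 = 929 bases, which is the region available for recombination between them.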

In the following examples we will show that the cre-lox system as used herein is a very efficient system for gene disruption and marker removal after a single transformation. In addition, when using strains deficient in NHEJ, the bipartite gene-targeting approach combined with the cre-lox system results in a highly efficient system for making marker-free strains with defined modifications.

Example 2 Efficient Gene Deletion Using Multiple Overlapping DNA Fragments without a Functional Marker (Bipartite Gene-Targeting Approach) and a Small Overlapping Sequence

In this experiment the effect of the overlap sequence size of the non-functional marker fragments on the transformation efficiency and targeting frequency through double homologous recombination was investigated. PCR fragments, encompassing the variable hygB marker length, flanked by NicB flanking regions of 1 kb (see FIG. 5), were generated in sufficient quantities using the pDEL_NicB-3 plasmid as template. Protoplasts of strain GBA302 (ΔglaA, ΔpepA, ΔhdfA) were transformed with 2 μg of each PCR fragment. Transformants were selected based on hygromycin B resistance, colony purified according to standard procedures as described in EP635574B and subsequently analyzed after purification. The targeting frequency was determined by diagnostic PCR using a primer within the hygB cassette and a primer within the genomic flanking region but outside the nucleotide region for targeting (see FIG. 5). The data shown in Table 1 demonstrate that, with good transformation efficiencies, targeting frequencies of the integrative cassettes were high (85% to 100%) for all tested sizes of overlapping marker sequences. Therefore, we conclude that overlapping sequences smaller than the roughly 1 kb mentioned in Example 1 herein and Example 4 of WO 2008113847 have no negative effect on targeting frequencies. In this manner, generation of fragments either by PCR or DNA synthesis is simplified and therefore construction of disruption mutants is more efficient.

TABLE 1 Transformation efficiency and targeting frequencies of nicB deletion cassettes using variable length of the overlapping marker sequences

Length of overlap (bp)    Nr. of transformants    Targeting (%)
960                       236                     100
750                       240                     95
640                       88                      85
380                       252                     100

Example 3 Simultaneous Gene Deletion Using Multiple Overlapping DNA Fragments without a Functional Marker with loxP Sites and Marker Removal after a Single Transformation Step

Use of a mutant which is deficient in a gene encoding a component involved in NHEJ, such as inactivation of at least one of the hdf genes results in a significant increase of the targeting efficiency of integration vectors through (double) homologous recombination (as earlier described in WO2005095624 and WO2008113847 for example).

In addition, increase of the targeting efficiency for homologous recombination can be obtained as described in WO2008113847. This bipartite gene-targeting method comprises providing two sets of DNA molecules of which the first set comprises DNA molecules each comprising a first non-functional fragment of the replacement sequence of interest flanked at its 5′-side by a DNA sequence substantially homologous to a sequence of the chromosomal DNA flanking the target sequence and the second set comprises DNA molecules each comprising a second non-functional fragment of the DNA replacement sequence of interest overlapping with the first non-functional fragment and flanked at its 3′-side by a DNA sequence substantially homologous to a sequence of the chromosomal DNA flanking the target sequence, wherein the first and second non-functional fragments become functional upon recombination.

Gene replacement vectors pDEL_EPO_Hyg-1 and pDEL_EPO_CRE-1 (layouts as described in Example 1) both comprise approximately a 1 kb flanking region for homologous recombination at the epo ORF. In addition, they both contain a (non functional) hygB selection marker and a loxP site (lox71 or lox66). The pDEL_EPO_CRE-1 construct also contains the bacteriophage P1 Cre gene under control of the A. nidulans xylanase A promoter to allow inducible Cre expression upon xylose induction.

The two linear bipartite gene-targeting fragments for epo disruption were generated by PCR in sufficient quantities using the pDEL_EPO_Hyg-1 and pDEL_EPO_CRE-1 plasmids as template. The overlap of the two nucleotide fragments at the non-functional hygB gene was around 1 kb in this case. For each fragment, 2 μg of DNA was used to transform Aspergillus niger GBA302. Transformants were selected based on hygromycin B resistance, colony purified according to standard procedures as described in EP635574B and subsequently analyzed after purification. From Example 2, it was learned that the majority of the transformants obtained with a flanking sequence of 1 kb and an overlap of 1 kb should result in a high frequency of targeted integration at the homologous epo locus, thus substituting the target locus by the functional hygB gene as depicted in FIG. 6.

For inducing the cre-recombinase under control of the xylanase promoter, minimal medium agar plates containing 1% xylose and 1% glucose (xylanase inducing medium) were used. Transformants were transferred from PDA plates to xylanase induction medium. Subsequently, the plates were incubated for 5 days at 30° C. When Cre recombinase is induced by xylose, deletion of the DNA cassette in between the two specific loxP sites can occur by excision. Resulting colonies after growth on xylanase inducing medium were tested for their hygromycin B resistance. Spores from the transformants were transferred to PDA plates with and without hygromycin B (60 μg/ml) using toothpicks. The plates were incubated for 48 hours at 30° C.
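The excision step can be modelled in a few lines. The sketch below is an illustration only and not the procedure of the invention. It uses the loxP, lox66, lox71 and lox72 sequences as they are commonly given in the literature, which is an assumption: SEQ ID NO: 1 to 3 of the sequence listing are authoritative and may be written in a different orientation. The flank and cassette strings are made-up placeholders, and strand exchange is simplified to a single crossover point at the end of the spacer.

```python
# Arm and spacer sequences as commonly cited in the literature (assumption;
# the sequence listing of the document is authoritative).
LEFT_WT, LEFT_MUT = "ATAACTTCGTATA", "TACCGTTCGTATA"
RIGHT_WT, RIGHT_MUT = "TATACGAAGTTAT", "TATACGAACGGTA"
SPACER = "GCATACAT"

LOXP = LEFT_WT + SPACER + RIGHT_WT     # wild type
LOX71 = LEFT_MUT + SPACER + RIGHT_WT   # mutant left arm
LOX66 = LEFT_WT + SPACER + RIGHT_MUT   # mutant right arm
LOX72 = LEFT_MUT + SPACER + RIGHT_MUT  # double mutant


def excise(genome: str, first_site: str, second_site: str):
    """Excise the segment between two directly repeated lox sites.

    Returns (residual, circle): the chromosome after excision and the
    excised circle written as a linear string opened at the crossover.
    """
    i = genome.find(first_site)
    j = genome.find(second_site, i + len(first_site)) if i >= 0 else -1
    if i < 0 or j < 0:
        raise ValueError("both lox sites must be present, in this order")
    cut = len(LEFT_WT) + len(SPACER)  # simplified crossover position
    residual = genome[: i + cut] + genome[j + cut:]
    circle = genome[i + cut: j + cut]
    return residual, circle


# Placeholder locus: 5' flank, lox71, marker and cre cassette, lox66, 3' flank.
genome = "<5'-flank>" + LOX71 + "<hygB-and-cre-cassette>" + LOX66 + "<3'-flank>"
residual, circle = excise(genome, LOX71, LOX66)

# The chromosome keeps the double mutant lox72 site ...
assert residual == "<5'-flank>" + LOX72 + "<3'-flank>"
# ... and the excised circle carries a wild-type loxP site across its junction.
assert LOXP in circle + circle
```

The point of the model is that, with the mutant arms placed on the outer sides of the cassette, the site left behind at the locus is the double mutant, while the wild-type site leaves with the excised DNA.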

Of 24 initial hygromycin B resistant colonies after growth on PDA starch, 4 transformants lost their hygromycin B resistance spontaneously (see also FIG. 7 for strain testing). Of 24 initial hygromycin B resistant colonies after growth on xylose, 19 transformants lost their hygromycin B resistance. Loss of hygromycin B resistance is likely coupled to loss of the hygB marker cassette through cre recombinase activity. Indeed, marker removal was confirmed by PCR analysis of the epo locus.
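For orientation, the colony counts above can be converted into marker-loss fractions. The short sketch below only restates the reported counts (4 of 24 without induction, 19 of 24 after growth on xylose); the function name is hypothetical and no statistical claim is implied.

```python
from fractions import Fraction


def loss_rate(lost: int, total: int) -> Fraction:
    """Fraction of colonies that lost hygromycin B resistance."""
    return Fraction(lost, total)


uninduced = loss_rate(4, 24)   # growth on PDA starch
induced = loss_rate(19, 24)    # growth on xylose

print(f"without induction: {float(uninduced):.1%}")  # 16.7%
print(f"after induction:   {float(induced):.1%}")    # 79.2%
```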

This Example shows that in a strain deficient in NHEJ, use of bipartite gene-targeting in combination with an inducible recombination system according to the invention allows for very efficient strain construction/disruption in building marker-free strains without the need for a second transformation or counter-selection procedures in strain construction.

Example 4 Simultaneous Multiple Gene Deletions Using Multiple Overlapping DNA Fragments without Functional Markers and Multiple Marker Removal after a Single Transformation Step

In this Example we describe a method to significantly shorten strain construction procedures by combining the use of multiple bipartite fragments in combination with cre-lox in a NHEJ deficient host strain to obtain multiple gene deletions. To facilitate multiple marker removal in a single transformation step, it is essential that at least one construct contains the Cre gene with the inducible xylanaseA promoter.

Two times two linear bipartite gene-targeting fragments for pdxA and epo disruption, respectively, were generated by PCR in sufficient quantities using the pDEL_PdxA-2 plasmid and the pDEL_EPO_Hyg-1 and pDEL_EPO_CRE-1 plasmids as template. The overlap of the two nucleotide fragments at the non-functional phleomycin ble gene was around 350 bp and for the hygB gene it was around 1 kb. For each of the four fragments, 2 μg of DNA was used to transform Aspergillus niger GBA302. Double deletion transformants were selected on a medium containing both hygromycin B and phleomycin. Colony purified strains were tested for correct phenotype and diagnosed by PCR for gene replacement of PdxA and epo. Upon induction of CRE by switching to a xylose containing growth medium, both selection markers were removed. Marker removal was confirmed by PCR analysis of the NicB and PdxA loci.

This Example shows that in a strain deficient in NHEJ, use of multiple bipartite gene-targeting in combination with an inducible recombination system according to the invention allows for very efficient strain construction/disruption in building marker-free strains with two modifications without the need for a second or third transformation step or counter-selection procedures in strain construction.

Example 5 Transformation of Rasamsonia emersonii Resulting in Selection Marker-Free Transformants Capable of Producing a Desired Enzyme which is Encoded by a Gene Introduced in the Transformant

This Example describes the construction of a marker-free R. emersonii transformant containing one or more additional copies of Cbhl. The marker is removed by transient expression of cre-recombinase in R. emersonii transformants.

Cloning of Transient Expression Plasmid pEBA513 Encoding Cre Recombinase

pEBA513 was constructed by DNA2.0 (Menlo Park, USA) and contains the following components: expression cassette consisting of the A. niger glaA promoter, ORF encoding cre-recombinase (AAY56380) and A. nidulans niaD terminator; expression cassette consisting of the A. nidulans gpdA promoter, ORF encoding hygromycin B resistance protein and P. chrysogenum penDE terminator (Genbank: M31454.1, nucleotides 1750-2219); pAMPF21 derived vector containing the AMA1 region and the CAT chloramphenicol resistance gene. FIG. 8 represents a map of pEBA513.

Transformation of R. emersonii with pDEL_PdxA-2 and Cbhl Expression Construct pGBTOPEBA205

In order to obtain an R. emersonii strain overexpressing Cbhl, R. emersonii was transformed to obtain a multicopy Cbhl strain. Plasmid pGBTOPEBA205, described in WO2011/054899, encoding R. emersonii Cbhl driven by the A. niger glucoamylase promoter was used in the transformation. R. emersonii transformation was performed according to the protocol described in WO2011/054899. R. emersonii was co-transformed with 1 μg of pDEL_PdxA-2 (for cloning details and description see Example 1 and FIG. 2) and 9 μg of pGBTOPEBA205 and co-transformants were identified using PCR analysis. The presence of the pDEL_PdxA-2 plasmid was determined using primers Ble-For (SEQ ID NO: 7) and Ble-Rev (SEQ ID NO: 8) and that of pGBTOPEBA205 using primers EBA205-For (SEQ ID NO: 9) and EBA205-Rev (SEQ ID NO: 10). Primers directed against pGBTOPEBA4 (SEQ ID NO: 11 and 12) and pGBTOPEBA8 (SEQ ID NO: 13 and 14) were used as a control.

Ble-For (SEQ ID NO: 7): 5′-AGTTGACCAGTGCCGTTCC-3′
Ble-Rev (SEQ ID NO: 8): 5′-CACGAAGTGCACGCAGTTG-3′
EBA205-For (SEQ ID NO: 9): 5′-CTTCTGCTGAGCAGCTCTGCC-3′
EBA205-Rev (SEQ ID NO: 10): 5′-GTTCAGACCGCAAGGAAGGTTG-3′
EBA4-For (SEQ ID NO: 11): 5′-CGAGAACCTGGCCTACTCTCC-3′
EBA4-Rev (SEQ ID NO: 12): 5′-CAGAGTTGTAGTCGGTGTCACG-3′
EBA8-For (SEQ ID NO: 13): 5′-GAAGGGTATCAAGAAGCGTGCC-3′
EBA8-Rev (SEQ ID NO: 14): 5′-GCCGAAGTTGTGAGGGTCAATG-3′
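Basic properties of such primers, such as length and GC content, can be checked with a few lines of code before use. The sketch below is an illustrative helper and not part of the described method. The primer sequences are copied from the list above, and the function names are hypothetical.

```python
PRIMERS = {
    "Ble-For": "AGTTGACCAGTGCCGTTCC",
    "Ble-Rev": "CACGAAGTGCACGCAGTTG",
    "EBA205-For": "CTTCTGCTGAGCAGCTCTGCC",
    "EBA205-Rev": "GTTCAGACCGCAAGGAAGGTTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, e.g. to locate a reverse primer on the top strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}")
```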

PCR conditions for the reactions: 50 μl reaction mix with 5 μl of template DNA, 20 pmol of each primer, 0.2 mM of dNTPs, 1× PHUSION® HF buffer and 1 U of PHUSION® DNA-Polymerase, according to PHUSION® High-Fidelity DNA Polymerase Manual (Finnzymes, Espoo, Finland), 30 s denaturation at 98° C., amplification in 30 cycles (10 s 98° C., 10 s 55° C., 15 s 72° C.), and a final incubation of 10 min at 72° C.
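The cycling program above can be written down as a small data structure, which also makes it easy to estimate the minimum run time. This is a sketch for illustration: the class and variable names are hypothetical, and the estimate counts hold times only, ignoring temperature ramping of the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the PCR conditions stated above.
INITIAL = Step("initial denaturation", 98, 30)
CYCLE = (
    Step("denaturation", 98, 10),
    Step("annealing", 55, 10),
    Step("extension", 72, 15),
)
N_CYCLES = 30
FINAL = Step("final extension", 72, 10 * 60)


def hold_time_seconds() -> int:
    """Sum of all hold times; ramping between temperatures is not included."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


print(hold_time_seconds() / 60, "min")  # 30 s + 30 x 35 s + 600 s = 1680 s
```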

Transformant A-A4 is a co-transformant that contains one or more copies of pGBTOPEBA205. The expected 452 bp PCR product of pGBTOPEBA205 was observed in the transformant (FIG. 9, lane 4); the same product is detected in the control PCR in which pGBTOPEBA205 was used as a template (lane 10), but not in the empty strain (lane 7). In the EBA4 and EBA8 PCR reactions, no specific bands were observed in the transformants, but the expected PCR products of 256 bp and 306 bp, respectively, were generated when plasmid DNA was used as template (lanes 8 and 9).

In conclusion, an R. emersonii transformant was generated carrying multiple copies of R. emersonii Cbhl.

Cellulase Activity Assay

Transformant A-A4 and control strains were fermented in MTP and supernatants were analysed for cellulase activity in a WSU cellulase activity assay. A 1.25-fold increase in cellulase activity was observed in supernatants of transformant A-A4 compared to the empty strain, indicating that the obtained transformant with multiple R. emersonii Cbhl copies is improved in cellulase activity.

Transformation of Phleomycin Resistant R. emersonii Transformants with AMA Plasmid pEBA513 Carrying the Cre Recombinase Gene and Selection of Phleomycin-Sensitive Transformants.

Cre recombinase was transiently expressed in R. emersonii transformant A-A4 to remove the loxP-flanked phleomycin resistance gene by recombination over the lox66 and lox71 site. The transformant was transformed with milliQ water (control) or with 10 μg of pEBA513 carrying a Cre recombinase and hygromycin expression cassette. pEBA513 transformed protoplasts were plated in overlay on regeneration medium containing 50 μg/ml of hygromycin B. Hygromycin-resistant transformants were grown on PDA containing 50 μg/ml of hygromycin B to allow expression of the cre recombinase. Removal of the ble marker was tested phenotypically by growing the transformants on media with and without 10 μg/ml of phleomycin. The majority (>90%) of the transformants after transformation with pEBA513 (with the cre recombinase) were phleomycin sensitive, indicating that cre recombinase works very efficiently in R. emersonii and that transformants lost the (ble) marker upon introduction and expression of the recombinase. In FIG. 10A, examples of different transformants and empty strains on PDA with 10 μg/ml phleomycin and PDA are shown.

A subset of transformants was also analysed by PCR. Transformants were grown in YGG medium for 16 hours at 44° C. and 250 rpm, and chromosomal DNA was isolated using the DNeasy plant mini kit (Qiagen, Hilden, Germany). Both parental strains containing the loxP-flanked ble gene and transformants in which cre recombinase was overexpressed were analysed by PCR using pdx primers directed against the flanks just outside the loxP sites:

Pdx-For (SEQ ID NO: 15): 5′-TTGAGCTGTTGCTCCGGTAG-3′
Pdx-Rev (SEQ ID NO: 16): 5′-CTCCGTAGTCATCGTCAATGG-3′

In addition, the presence of pEBA513 was determined by PCR using primers directed against the HygB selection marker of the plasmid:

Hyg-For (SEQ ID NO: 17): 5′-GCGTCGGTTTCCACTATC-3′
Hyg-Rev (SEQ ID NO: 18): 5′-GAGGTCGCCAACATCTTC-3′

PCR conditions for the reactions were as described above. The result of the agarose gel is presented in FIG. 10B. A specific PCR band of 2752 bp is observed in transformants containing the loxP-flanked ble expression cassette (lanes 2 and 3). In contrast, in transformants in which the ble marker is removed by cre recombinase a PCR fragment of 881 bp was amplified (lanes 8 and 9), indicating that the ble gene was removed by the cre recombinase. Thus, we successfully removed the loxP-flanked ble selection marker from R. emersonii transformants using the cre-lox system.

The presence of the pEBA513 AMA-Cre plasmid was determined by a HygB PCR. Interestingly, in one of the two transformants no HygB fragment was detectable. As the transformants were grown under conditions without hygB selection, this transformant had probably already lost the episomal cre expression plasmid and, with it, the hygB marker.

Removal of the pEBA513 Plasmid to Obtain Marker-Free Transformants

After removing the ble selection marker, strains were identified that spontaneously lost the pEBA513 plasmid. We had already observed that part of the transformants lost the AMA plasmid while selecting for phleomycin-sensitive clones on PDA plates with and without phleomycin. In order to check spontaneous loss of the episomal AMA plasmid pEBA513 after growing the transformants without hygromycin selection, spores were transferred to plates with and without hygromycin B. After a single round of growth without selection, 50-75% of the transformants were already hygromycin B sensitive, which was confirmed by hygB PCRs as described above.

After marker removal, the transformant still contained multiple R. emersonii Cbhl copies and also the cellulase activity was still 1.25-fold improved compared to the empty strain.

In conclusion, we successfully generated marker-free R. emersonii transformants by using two dominant markers: a loxP-flanked ble marker that was used for co-transformation with a gene of interest, and a hygromycin marker that was used to transiently transform R. emersonii transformants with an AMA plasmid carrying the cre recombinase gene. Transient transformation of R. emersonii with cre recombinase was sufficient to remove the loxP-flanked ble marker.

Example 6 Identification of Rasamsonia emersonii Genes Involved in Non-Homologous End-Joining and Construction of the Deletion Vectors

Genomic DNA of Rasamsonia emersonii strain CBS393.64 was sequenced and analyzed. The genes with translated proteins annotated as homologues to known genes involved in non-homologous end-joining are listed in Table 2:

TABLE 2 Genes involved in non-homologous end-joining in Rasamsonia emersonii and their homologues in A. niger, P. chrysogenum and S. cerevisiae

R. emersonii    S. cerevisiae    Genomic sequence    cDNA             Protein
ReKu70          Ku70             SEQ ID NO: 19       SEQ ID NO: 20    SEQ ID NO: 21
ReKu80          Ku80             SEQ ID NO: 22       SEQ ID NO: 23    SEQ ID NO: 24
ReRad50         Rad50            SEQ ID NO: 25       SEQ ID NO: 26    SEQ ID NO: 27
ReRad51         Rad51            SEQ ID NO: 28       SEQ ID NO: 29    SEQ ID NO: 30
ReRad52         Rad52            SEQ ID NO: 31       SEQ ID NO: 32    SEQ ID NO: 33
ReRad54         Rad54            SEQ ID NO: 34       SEQ ID NO: 35    SEQ ID NO: 36
ReRad54         Rad54            SEQ ID NO: 37       SEQ ID NO: 38    SEQ ID NO: 39
ReRad55         Rad55            SEQ ID NO: 40       SEQ ID NO: 41    SEQ ID NO: 42
ReRad57         Rad57            SEQ ID NO: 43       SEQ ID NO: 44    SEQ ID NO: 45
ReCDC2          CDC2             SEQ ID NO: 46       SEQ ID NO: 47    SEQ ID NO: 48
ReLIG4          LIG4             SEQ ID NO: 49       SEQ ID NO: 50    SEQ ID NO: 51
ReMRE11         MRE11            SEQ ID NO: 52       SEQ ID NO: 53    SEQ ID NO: 54

The sequences of the R. emersonii genes involved in non-homologous end-joining comprise the genomic sequences of the open reading frames (ORF) (with introns) and approximately 1500 bp of the 5′ and 3′ flanking regions, the cDNA sequences and the protein sequences.

Two replacement vectors for ReKu80, pEBA1001 and pEBA1002, were constructed according to routine cloning procedures (see FIGS. 11 and 12). The insert fragments of both vectors together can be applied in the so-called “bipartite gene-targeting” method (Nielsen et al., 2006, 43: 54-64). This method uses two overlapping, non-functional DNA fragments of a selection marker (see also WO2008113847 for further details of the bipartite method) together with gene-targeting sequences. Upon correct homologous recombination the selection marker becomes functional by integration at a homologous target locus. The deletion vectors pEBA1001 and pEBA1002 were designed as described in WO 2008113847, to be able to provide the two overlapping DNA molecules for bipartite gene-targeting.

The pEBA1001 vector comprises a 2500 bp 5′ flanking region of the ReKu80 ORF for targeting in the ReKu80 locus, a lox66 site, and the non-functional 5′ part of the ble coding region driven by the A. nidulans gpdA promoter (PgpdA-ble sequence missing the last 104 bases of the coding sequence at the 3′ end of ble, SEQ ID NO: 60) (FIG. 11). The pEBA1002 vector comprises the non-functional 3′ part of the ble coding region (ble-TtrpC sequence missing the first 12 bases of the coding sequence at the 5′ end of ble, SEQ ID NO: 61), the A. nidulans trpC terminator, a lox71 site, and a 2500 bp 3′ flanking region of the ReKu80 ORF for targeting in the ReKu80 locus (FIG. 12).

Example 7 Inactivation of the ReKu80 Gene in Rasamsonia emersonii

Linear DNAs of the deletion constructs pEBA1001 and pEBA1002 were isolated and used to transform Rasamsonia emersonii strain TEC-142S using the method described earlier in WO2011/054899. These linear DNAs can integrate into the genome at the ReKu80 locus, thus substituting the ReKu80 gene by the ble gene as depicted in FIG. 13. Transformants were selected on phleomycin media and colony purified and tested according to procedures as described in WO2011/054899. Growing colonies were diagnosed by PCR for integration at the ReKu80 locus using a primer in the gpdA promoter of the deletion cassette and a primer directed against the genomic sequence directly upstream of the 5′ targeting region. From a pool of approximately 250 transformants, 4 strains showed a removal of the genomic ReKu80 gene.

Subsequently, 3 candidate ReKu80 knock out strains were transformed with pEBA513 to remove the ble selection marker by transient expression of the cre recombinase. pEBA513 transformants were plated in overlay on regeneration medium containing 50 μg/ml of hygromycin B. Hygromycin-resistant transformants were grown on PDA containing 50 μg/ml of hygromycin B to allow expression of the cre recombinase. Single colonies were plated on non-selective Rasamsonia agar medium to obtain purified spore batches. Removal of the ble marker was tested phenotypically by growing the transformants on media with and without 10 μg/ml of phleomycin. The majority (>90%) of the transformants after transformation with pEBA513 (with the cre recombinase) were phleomycin sensitive, indicating removal of the pEBA1001 and pEBA1002-based ble marker. Removal of the pEBA513 construct in ble-negative strains was subsequently diagnosed phenotypically by growing the transformants on media with and without 50 μg/ml of hygromycin. Approximately 50% of the transformants lost hygromycin resistance due to spontaneous loss of the pEBA513 plasmid.

Candidate marker-free knock-out strains were tested by Southern analysis for deletion of the ReKu80 gene. Chromosomal DNA was isolated and digested with restriction enzyme HindIII. Southern blots were hybridized with a probe directed against the 3′ region of the ReKu80 gene (FIG. 14). The following primers were used to generate the probe:

Ku80-For (SEQ ID NO: 55): AGGGTATATGTGGTCTAGTAACGC
Ku80-Rev (SEQ ID NO: 56): TCACAAGTCCATCACGGAACCGGG

The expected fragment sizes in wild-type strains, phleomycin resistant ReKu80 knock-out strains and in the phleomycin sensitive strains, were, respectively, 4132 bp, 3197 bp and 1246 bp. The wild-type control strain showed the expected 4132 bp fragment (FIG. 14, lane 1). The 2 candidate phleomycin resistant ReKu80 knock out strains indeed showed the expected 3197 bp fragment (lanes 2 and 3). Removal of the ble gene by cre recombinase resulted in a size reduction of the fragment; a 1246 bp band was detectable on the Southern blot (lanes 5 and 6). In conclusion, we confirmed by Southern blotting that we obtained 2 independent marker-free ReKu80 deletion strains.
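The interpretation of the Southern blot amounts to matching each observed band against the three expected HindIII fragment sizes. A minimal sketch of that matching is given below. The expected sizes are those stated above, whereas the function name and the 5% size tolerance are assumptions chosen for illustration only.

```python
# Expected HindIII fragment sizes (bp) as stated in the text.
EXPECTED_BP = {
    "wild type": 4132,
    "ReKu80 knock-out, ble marker present": 3197,
    "ReKu80 knock-out, marker removed": 1246,
}


def classify_band(observed_bp: float, tolerance: float = 0.05) -> str:
    """Assign an observed band to the closest expected fragment.

    Returns "unassigned" when the band deviates from the closest expected
    size by more than `tolerance` (relative; 5% is an arbitrary choice).
    """
    label, expected = min(
        EXPECTED_BP.items(), key=lambda item: abs(item[1] - observed_bp)
    )
    if abs(expected - observed_bp) <= tolerance * expected:
        return label
    return "unassigned"


for band in (4100, 3200, 1250, 2500):
    print(band, "->", classify_band(band))
```

On these expected sizes, removal of the marker shortens the hybridizing fragment by 3197 - 1246 = 1951 bp, which is why the three genotypes are readily distinguished on the blot.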

Strain deltaReKu80-2 was selected as a representative strain with the ReKu80 gene inactivated.

Example 8 Cloning of RePepA Deletion Vectors

Genomic DNA of Rasamsonia emersonii strain CBS393.64 was sequenced and analyzed. The gene with translated protein annotated as protease pepA was identified. Sequences of Rasamsonia emersonii pepA (RePepA), comprising the genomic sequence of the ORF and approximately 2500 bp of the 5′ and 3′ flanking regions, cDNA and protein sequence, are shown in sequence listings 57, 58 and 59 respectively.

Gene replacement vectors for RePepA were designed using the bipartite gene-targeting method and constructed according to routine cloning procedures (see FIGS. 15 and 16). The pEBA1005 construct comprises a 2500 bp 5′ flanking region of the RePepA ORF for targeting in the RePepA locus, a lox66 site, and the 5′ part of the ble coding region driven by the A. nidulans gpdA promoter (PgpdA-ble sequence missing the last 104 bases of the coding sequence at the 3′ end of ble, SEQ ID NO: 60) (FIG. 15). The pEBA1006 construct comprises the 3′ part of the ble coding region (ble-TtrpC sequence missing the first 12 bases of the coding sequence at the 5′ end of ble, SEQ ID NO: 61), the A. nidulans trpC terminator, a lox71 site, and a 2500 bp 3′ flanking region of the RePepA ORF for targeting in the RePepA locus (FIG. 16). In addition, pEBA10056 was constructed carrying a complete RePepA deletion cassette (FIG. 17). The pEBA10056 construct comprises a 2500 bp 5′ flanking region of the RePepA ORF for targeting in the RePepA locus, a lox66 site, the ble expression cassette containing the A. nidulans gpdA promoter, ble coding region and the A. nidulans trpC terminator, a lox71 site, and a 2500 bp 3′ flanking region of the RePepA ORF for targeting in the RePepA locus.

In addition to pEBA1005 and pEBA1006 containing 2500 bp RePepA flanks, constructs were generated containing 500, 1000 and 1500 bp RePepA flanks to test the optimal flank length. pEBA1005 and pEBA1006 are representative of those constructs, which differ only in flank length.

Example 9 Improved Targeting for Homologous Recombination Events at the RePepA Locus

The targeting efficiency in the ReKu80 knock out strain vs a wild-type strain was assessed by transformation of TEC-142S and the deltaReKu80-2 strain with deletion vectors designed for the inactivation of the RePepA gene encoding the major extracellular acid aspartyl protease from the genome. The RePepA deletion vectors were amplified by PCR and the PCR product was used to transform protoplasts of TEC-142S and the deltaReKu80-2 strain. Transformant selection was performed as described in Example 7.

The targeting frequency was assessed by activity-based plate assays indicative of the inactivation of RePepA. The plate assays were performed by propagating transformants on PDA plates supplemented with 1% Casein sodium salt. In total 20 transformants of each transformation were analysed for halo formation. Most transformants of CBS393.64 still showed halo formation after transformation with 2.5 kb RePepA deletion constructs, whereas no halo formation was observed in transformants of deltaReKu80-2 (FIG. 18). In Table 3, the targeting frequency, as judged by halo formation on casein plates is shown.

TABLE 3 Targeting frequencies of RePepA deletion vectors with different flanking lengths in the deltaReKu80-2 strain as compared with strain CBS393.64. Deletion vectors using the bipartite gene-targeting method are indicated with “(bipartite)”

                        Targeting (%)
Flanking length         TEC-142S    deltaReKu80-2
2.5 kb                  <5          90
2.5 kb (bipartite)      10          100
1.5 kb (bipartite)      5           100
1 kb (bipartite)        <5          85
0.5 kb (bipartite)      <5          n.d.*

*not determined because of low amount of transformants

The targeting efficiency was significantly improved in the deltaReKu80-2 strain compared to the CBS393.64 strain. In the wild-type strain the highest targeting efficiency (10%) was observed when using 2.5 kb flanks with the bipartite gene-targeting method. Deletion of RePepA using a plasmid carrying the complete deletion cassette was successful in 90% of the transformants of the deltaReKu80-2 strain. When using the bipartite gene-targeting method in the deltaReKu80-2 strain, 1.5 kb flanks are already sufficient to obtain 100% targeting and 1 kb flanks to obtain correct transformants with high efficiency.
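Because 20 transformants were analysed per transformation, the percentages in Table 3 correspond to small whole numbers of colonies. The sketch below makes that conversion explicit. It assumes that exactly 20 transformants were scored in every case, which the text states in general terms but which may not hold for each individual entry; the function name is hypothetical.

```python
N_SCORED = 20  # transformants analysed per transformation, per the text


def implied_count(percent: float, n: int = N_SCORED) -> int:
    """Number of correctly targeted transformants implied by a percentage."""
    return round(percent * n / 100)


# Smallest non-zero frequency that n scored colonies can resolve.
resolution = 100 / N_SCORED

print(implied_count(90), implied_count(85), implied_count(10), implied_count(5))
print(f"one colony out of {N_SCORED} corresponds to {resolution:.0f}%")
```

Under this assumption a single targeted colony already corresponds to 5%, so an entry of "<5" is consistent with no targeted transformant among those scored.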

These findings indicate that inactivation of at least one of the genes involved in non-homologous end joining in Rasamsonia emersonii results in a significant increase of the targeting efficiency of vectors for integration through double homologous recombination. In this example this has been illustrated for disruption of ReKu80.

Example 10 Construction of Rasamsonia Deletion Vector for Simultaneous Gene Deletion Using Multiple Overlapping DNA Fragments without a Functional Marker with loxP Sites and Marker Removal after a Single Transformation Step

Gene replacement vectors for RePepA were designed using the bipartite-targeting method as described in Example 3, with one exception: RePepA flanking regions of approximately 1500 base-pairs were used for homologous recombination at the RePepA ORF. The first vector pPepAHyg (General layout as in FIG. 19) comprises a first non-functional hygB marker fragment (PgpdA-HygB sequence missing the last 27 bases of the coding sequence at the 3′ end of hygB, SEQ ID NO: 4) and at one side of the hygB cassette a Lox71 sequence site and the 5′-upstream gene flanking region of the RePepA ORF (5′ region pepA). The second pPepACre vector (General layout as in FIG. 20) comprises a second non-functional hygB fragment (HygB-TtrpC sequence missing the first 44 bases of the coding sequence at the 5′ end of hygB, SEQ ID NO: 5) and at one side of the hygB cassette, a cre recombinase cassette, a Lox66 sequence site and the 3′-downstream gene flanking region of the RePepA ORF (3′ region RePepA). The cre recombinase cassette contains the A. nidulans xylanase A promoter, a cre recombinase and xylanase A terminator, to allow xylose-inducible expression of the cre recombinase (SEQ ID NO: 6). Upon homologous recombination, the first and second non-functional fragments together produce a functional hygB cassette. Both the RePepA upstream and downstream gene flanking regions target the bipartite fragments for homologous recombination at the predestined RePepA genomic locus.

In the following example we will show that the cre-lox system as used herein is a very efficient system for gene disruption and marker removal after a single transformation. In addition, when using strains deficient in NHEJ, the bipartite gene-targeting approach combined with the cre-lox system results in a highly efficient system for making marker-free strains with defined modifications.

Example 11 Efficient Gene Deletion Using Multiple Overlapping DNA Fragments without a Functional Marker (Bipartite Gene-Targeting Approach) and a Small Overlapping Sequence

Use of a mutant which is deficient in a gene encoding a component involved in NHEJ, such as inactivation of at least one of the Ku genes results in a significant increase of the targeting efficiency of integration vectors through (double) homologous recombination (see Example 9).

In addition, increase of the targeting efficiency for homologous recombination can be obtained as described in Example 9. This bipartite gene-targeting method comprises providing two sets of DNA molecules of which the first set comprises DNA molecules each comprising a first non-functional fragment of the replacement sequence of interest flanked at its 5′-side by a DNA sequence substantially homologous to a sequence of the chromosomal DNA flanking the target sequence and the second set comprises DNA molecules each comprising a second non-functional fragment of the DNA replacement sequence of interest overlapping with the first non-functional fragment and flanked at its 3′-side by a DNA sequence substantially homologous to a sequence of the chromosomal DNA flanking the target sequence, wherein the first and second non-functional fragments become functional upon recombination.

Gene replacement vectors pPepAHyg and pPepACre (layouts as described in Example 10) both comprise approximately a 1.5 kb flanking region for homologous recombination at the RePepA ORF. In addition, they both contain a (non functional) hygB selection marker and a loxP site (lox71 or lox66). The pPepACre construct also contains the bacteriophage P1 Cre gene under control of the A. nidulans xylanase A promoter to allow inducible Cre expression upon xylose induction.

The two linear bipartite gene-targeting fragments for RePepA disruption were generated by PCR in sufficient quantities using the pPepAHyg and pPepACre plasmids as template. The overlap of the two nucleotide fragments at the non-functional hygB gene was around 1 kb in this case. These linear DNAs can integrate into the genome at the RePepA locus, thus substituting the RePepA gene by the hygB gene as depicted in FIG. 21.

For each fragment, 2 μg of DNA was used to transform R. emersonii strain deltaReKu80-2. Transformants were selected based on hygromycin B resistance, colony purified according to standard procedures as described in Example 5 and subsequently analyzed after purification.

To induce the Cre recombinase under control of the xylanase promoter, minimal medium agar plates containing 1% xylose and 1% glucose (xylanase induction medium), supplemented with 0.2% yeast extract, were used. Transformants were transferred from PDA plates to xylanase induction medium with yeast extract. Subsequently, the plates were incubated for 5 days at 42° C. Resulting colonies after growth on xylose were plated on non-selective Rasamsonia agar medium to obtain purified spore batches. When Cre recombinase is induced by xylose, deletion of the DNA cassette in between the two specific loxP sites can occur by excision. Removal of the hygB marker was tested phenotypically by growing the transformants on media with and without 50 μg/ml of hygromycin B. Approximately 65% of the cre-induced transformants were unable to grow on hygromycin B (FIG. 22). Loss of hygromycin B resistance is likely coupled to loss of the hygB marker cassette through Cre recombinase activity. Indeed, marker removal was confirmed by PCR analysis of the RePepA locus.
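The excision step can be pictured with a short Python sketch that treats the integrated locus as an ordered list of named elements. The element order, the element names and the "lox_scar" label for the site left behind are assumptions made for illustration; they are not taken from the construct sequences.

```python
def excise_between_lox(locus, left="lox71", right="lox66", scar="lox_scar"):
    """Return the locus after removal of everything from `left` to `right`.

    A single site (`scar`) is left behind, standing in for the recombined
    lox site that remains at the target locus.
    """
    start = locus.index(left)            # raises ValueError if the site is absent
    end = locus.index(right, start + 1)  # the partner site must lie downstream
    return locus[:start] + [scar] + locus[end + 1:]


# Hypothetical layout of the locus after integration of both fragments.
integrated = ["5' flank", "lox71", "hygB", "Pxyl-cre", "lox66", "3' flank"]

after_induction = excise_between_lox(integrated)
print(after_induction)
```

In this model both the marker and the recombinase gene sit between the two sites, so a single excision event removes both.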

This Example shows that, in a strain deficient in NHEJ, use of bipartite gene-targeting in combination with an inducible recombination system according to the invention allows very efficient strain construction/disruption when building marker-free strains, without the need for a second transformation or counter-selection procedures.

Example 12 Knocking Out the ADE1 Gene in S. cerevisiae Using a Bipartite Marker and Cre-Recombinase with Inducible Promoter all Flanked Between Lox71 and Lox66

This procedure is set out schematically in FIG. 23. Use two basic constructs in this cloning procedure. The basic constructs can be ordered at DNA2.0 or any other commercial company providing synthetic DNA sequences. Basic construct 1 contains the lox71 site followed by a non-functional part of the KanMX marker cassette (SEQ ID NO: 62). The sequence can be cloned in a standard E. coli cloning vector used by the synthetic DNA provider. The second basic construct contains a non-functional part of the KanMX marker cassette with overlapping sequences of 50 bp towards the non-functional part of KanMX of the first construct. When both non-functional KanMX marker fragments recombine via in vivo homologous recombination, a fully functional KanMX marker cassette will be formed. The second construct also contains the cre-recombinase with the galactose inducible GAL promoter and lox66. The sequence of basic construct 2 is provided as SEQ ID NO: 63 in the sequence list. Perform the following steps to knock out the ADE1 gene in S. cerevisiae.

Chromosomal DNA Isolation with YeaStar Genomic DNA Kit™ (ZYMO Research)

Inoculate the S. cerevisiae CEN.PK113-7D (MATa MAL2-8c SUC2) yeast strain in 1 ml YephD (2% glucose) in a 24 well plate and incubate overnight (ON) at 30° C., 550 rpm and 80% humidity in a shaker. Measure the OD660 with a Biochrom Ultrospec 2000 spectrophotometer to obtain the right amount of cells (1-5×10⁷ cells) as described in the manual of the kit. Proceed with the isolation as described in Protocol II in the manual of the YeaStar Genomic DNA Kit™. After isolation, measure the concentration with a NANODROP™ ND-1000 (Thermo Scientific). Concentrations are usually low, in the order of 10 ng/μl, but sufficient for PCR purposes.

PCR Amplify and Purify Fragments for Transformation to S. cerevisiae

First PCR amplify the fragments necessary for the transformation to S. cerevisiae. PCR fragment 1 (SEQ ID NO: 72) is a genomic integration flank upstream of the ADE1 sequence that needs to be deleted. It is amplified using the forward primer with SEQ ID NO: 64 and the reverse primer with sequence SEQ ID NO: 65.

PCR fragment 2 (SEQ ID NO: 73) is the sequence of basic construct 1 with 50 bp overlapping homologous sequences towards PCR fragment 1. It is amplified using the primers with sequence SEQ ID NO: 66 and SEQ ID NO: 67. It contains lox66 and a non-functional part of the KanMX marker cassette.

PCR fragment 3 (SEQ ID NO: 74) is the sequence of basic construct 2 containing the overlapping sequence towards the non-functional KanMX marker cassette in PCR fragment 2. It also contains the Cre recombinase expression cassette and the lox71 site. The 3′ end sequence of PCR fragment 3 contains a 50 bp overlapping sequence towards PCR fragment 4. PCR fragment 3 is amplified using the primers with sequence SEQ ID NO: 68 and sequence SEQ ID NO: 69.

PCR fragment 4 (SEQ ID NO: 75) is the genomic integration flank downstream of ADE1 sequence to delete. PCR fragment 4 is amplified using the primers with sequence SEQ ID NO: 70 and sequence SEQ ID NO: 71.
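Before ordering primers, the four-fragment design described above can be checked in silico. The following Python sketch tests whether each fragment shares its terminal 50 bases with the next one. The random toy sequence and the cut positions are hypothetical stand-ins for SEQ ID NO: 72 to 75, which are not reproduced here.

```python
import random


def junctions_ok(fragments, overlap=50):
    """Return one boolean per junction.

    A junction is True when the last `overlap` bases of a fragment are
    identical to the first `overlap` bases of the following fragment.
    """
    return [a[-overlap:] == b[:overlap] for a, b in zip(fragments, fragments[1:])]


# Toy stand-in for the final assembled locus (random bases, fixed seed).
rng = random.Random(0)
design = "".join(rng.choice("ACGT") for _ in range(1000))

# Hypothetical cut positions giving four fragments with 50-base overlaps.
cuts = [(0, 250), (200, 550), (500, 800), (750, 1000)]
fragments = [design[start:end] for start, end in cuts]

print(junctions_ok(fragments))
```

A False entry points to the junction at which in vivo assembly of the fragments would not be expected to occur as designed.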

Amplify the DNA fragments with Phusion polymerase (Finnzymes) according to the manual. Use basic construct 1 and 2 for PCR reaction 2 and 3 respectively as template. Use CEN.PK113-7D genomic DNA (isolated as described previously) as template for the amplification of the 5′ and 3′ ADE1 deletion flanks. Check the size of the PCR fragments with standard agarose electrophoresis techniques. Purify the PCR amplified DNA fragments with the NucleoMag® 96 PCR magnetic beads kit of Macherey-Nagel, according to the manual and measure the DNA concentrations with the Trinean DropSense® 96 of GC biotech.

Transformation of the PCR Fragments to S. cerevisiae

Perform transformation of S. cerevisiae according to Gietz and Woods (2002; Transformation of the yeast by the LiAc/SS carrier DNA/PEG method. Methods in Enzymology 350: 87-96). Transform CEN.PK113-7D (MATa MAL2-8c SUC2) with 1 μg of each of the amplified and purified PCR fragments (PCR fragments 1-4). Plate transformation mixtures on YEPhD-agar (BBL PHYTONE™ peptone 20.0 g/l, Yeast Extract 10.0 g/l, Sodium Chloride 5.0 g/l, Agar 15.0 g/l and 2% glucose) containing G418 (400 μg/ml). After 3 to 5 days of incubation at 30° C., colonies will appear on the plates, whereas the negative control (i.e., no addition of DNA in the transformation experiment) will result in blank plates. The correct KO of the ADE1 gene will give a red or pink colored phenotype of the colonies.
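The volume of each purified PCR product needed to deliver 1 μg of DNA follows directly from its measured concentration. A minimal Python sketch of that arithmetic is given below; the concentrations are invented example values, not measurements from this Example.

```python
def volume_for_mass(mass_ng, concentration_ng_per_ul):
    """Return the volume in microliters that contains `mass_ng` of DNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / concentration_ng_per_ul


# Hypothetical concentrations (ng/ul) of the four purified PCR fragments.
measured = {
    "fragment 1": 50.0,
    "fragment 2": 125.0,
    "fragment 3": 80.0,
    "fragment 4": 200.0,
}

# 1 ug = 1000 ng of each fragment goes into the transformation mix.
volumes = {name: volume_for_mass(1000.0, conc) for name, conc in measured.items()}

for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} ul")
```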

Efficient Out-Recombination of the Marker Cassette and Cre Recombinase

Pick six red colonies from the transformation plates and transfer with an inoculation loop to 2 ml YEP medium (Peptone 10.0 g/l, Yeast Extract 10.0 g/l, Sodium Chloride 5.0 g/l) with 20 g/l galactose and 0.5 g/l glucose in a 12 ml Greiner tube. After ON incubation at 30° C., plate the cultures on YEP-agar with 2% galactose in the appropriate dilution to obtain single colonies on the plate and incubate for 2-4 days at 30° C. Pick some colonies and restreak each one both on an individual fresh YEPhD-agar plate and on an individual YEPhD-agar plate containing G418 (400 μg/ml). The S. cerevisiae transformants will grow on the YEPhD-agar plate without G418 and most of them will not grow on the YEPhD-agar plate containing G418, indicating loss of the KanMX marker cassette and cre-recombinase gene by recombination of the lox66 and lox71 sites induced by the cre-recombinase. Continue with the colonies that lost the marker and Cre cassette.
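The restreak readout reduces to a simple decision per colony, sketched below in Python. The colony names and growth observations are invented for illustration, and the category labels are assumptions rather than terms used in this Example.

```python
def classify_colony(grows_without_g418, grows_with_g418):
    """Interpret paired restreaks on plates without and with G418."""
    if not grows_without_g418:
        return "no growth"  # not viable, cannot be interpreted
    return "marker retained" if grows_with_g418 else "marker lost"


# Hypothetical observations: (growth without G418, growth with G418).
observations = {
    "colony 1": (True, False),
    "colony 2": (True, False),
    "colony 3": (True, True),
    "colony 4": (False, False),
}

calls = {name: classify_colony(*growth) for name, growth in observations.items()}
keep = [name for name, call in calls.items() if call == "marker lost"]
print(keep)
```

Only colonies called "marker lost" would be carried forward, matching the instruction to continue with colonies that lost the marker and Cre cassette.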

This is a method that enables the knockout of a gene and removal of the marker in S. cerevisiae with a fast and efficient one step procedure using a split marker and cre recombinase gene both between lox66 and lox71 sites.

Example 13 Automated and High Throughput Gene Deletion Using Multiple Overlapping DNA Fragments without a Functional Marker (Bipartite Gene-Targeting Approach) and a Small Overlapping Sequence

Based on the approach used for the epo gene in Example 1, where two different gene replacement vectors were used, two gene replacement vectors were designed for each of a large set of target genes. In essence, these vectors comprise approximately 0.9-1.2 kb flanking regions of the respective ORF sequences, to target for homologous recombination at the predestined genomic loci. They contain the hygromycin B marker but may also contain, for example, the A. nidulans bi-directional amdS selection marker or the phleomycin selection marker for transformation.

Based on genomic sequences, gene replacement vectors for 96 selected genes were designed as follows:

A first vector pDEL_UP_Hyg-1 (General layout as in FIG. 3) comprises a first non-functional hygB marker fragment (PgpdA-HygB sequence missing the last 27 bases of the coding sequence at the 3′ end of hygB, SEQ ID NO: 4) and at one side of the hygB cassette a Lox71 sequence site and the 900-1200 bp 5′-upstream gene flanking region of the target ORF (-US). This upstream gene flanking region can be made synthetically or made by PCR using fragment-specific oligonucleotides for PCR amplification.

A second pDEL_Down_CRE-1 vector (General layout as in FIG. 4) comprises a non-functional hygB fragment (HygB-TtrpC sequence missing the first 44 bases of the coding sequence at the 5′ end of hygB, SEQ ID NO: 5) and at one side of the hygB cassette, a cre recombinase cassette, a Lox66 sequence site and the 3′-downstream gene flanking region of the target ORF (-DS). The cre recombinase cassette is described in Example 1 above. For each specific target gene/genomic area a specific set of gene replacement vectors is made: one pDEL_UP_Hyg-1 type for the upstream fragment and one pDEL_Down_CRE-1 type for the downstream fragment. These vectors can be made through various methods, for example gene synthesis, Gibson cloning (Gibson D G, Young L, Chuang R Y, Venter J C, Hutchison C A 3rd, Smith H O. (2009). “Enzymatic assembly of DNA molecules up to several hundred kilobases”. Nature Methods 6 (5): 343-345; Gibson D G. (2011). “Enzymatic assembly of overlapping DNA fragments”. Methods in Enzymology 498: 349-361), cloning through restriction digestion, ligation and E. coli transformation, or in vivo recombination in yeast, preferably a method which can be automated and performed in MTP.
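The size of the region shared by the two truncated hygB fragments follows from the two truncations (27 bases missing at the 3′ end of the first fragment, 44 bases missing at the 5′ end of the second). The Python sketch below works through that arithmetic. The coding-sequence length of 1026 bases is an assumption used for illustration and should be checked against SEQ ID NO: 4 and SEQ ID NO: 5.

```python
def split_marker_overlap(cds_length, missing_3prime=27, missing_5prime=44):
    """Return the number of coding bases shared by the two marker fragments.

    The first fragment covers coding bases 1..(cds_length - missing_3prime);
    the second covers coding bases (missing_5prime + 1)..cds_length.
    """
    first_end = cds_length - missing_3prime
    second_start = missing_5prime + 1
    overlap = first_end - second_start + 1
    if overlap <= 0:
        raise ValueError("fragments do not overlap")
    return overlap


ASSUMED_HYGB_CDS_LENGTH = 1026  # assumption, not taken from the sequence listing

overlap = split_marker_overlap(ASSUMED_HYGB_CDS_LENGTH)
print(overlap)
```

Under this assumed length the shared region is close to 1 kb, and neither fragment on its own contains the complete coding sequence.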

The method applied for gene replacements in this example uses linear DNA fragments, preferably made by PCR in an MTP plate, using the two different types of gene replacement vectors pDEL_UP_Hyg-1 and pDEL_Down_CRE-1 as template for each specific gene. As also detailed in WO 2008113847, these two different fragments were designed and constructed to be able to provide the two overlapping DNA molecules for bipartite gene-targeting. Therefore, linear DNA fragments are made in sufficient quantities by PCR using the respective gene-specific plasmid as template. Preferably, the PCR fragments are mixed by using pipetting robots and made in an MTP plate. Protoplasts of strain GBA302 (ΔglaA, ΔpepA, ΔhdfA) are transformed with 2 μg of each PCR fragment.
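For automated handling, each of the 96 target genes needs a fixed position in the MTP plate. A minimal Python sketch of such a layout follows. The row-major fill order and the placeholder gene names are assumptions; the Example does not specify either.

```python
import string

ROWS = string.ascii_uppercase[:8]  # rows A to H of a 96-well plate
COLUMNS = 12                       # columns 1 to 12


def well_name(index):
    """Map a 0-based position to a well name, filling the plate row by row."""
    if not 0 <= index < len(ROWS) * COLUMNS:
        raise IndexError("a 96-well plate has positions 0..95")
    row, column = divmod(index, COLUMNS)
    return f"{ROWS[row]}{column + 1}"


# Placeholder names standing in for the 96 selected target genes.
genes = [f"gene_{number:02d}" for number in range(1, 97)]

layout = {well_name(position): gene for position, gene in enumerate(genes)}
print(layout["A1"], layout["H12"])
```

Using the same well for a given gene in the PCR, transformation and selection plates keeps the upstream and downstream fragments of that gene paired throughout the workflow.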

Preferably, the transformation is done according to the method of WO 2008/000715 and performed in MTP. Transformants are selected based on hygromycin B resistance, preferably in regeneration plates with hygB. In addition to the previously described large scale agar plates, this plate can preferably be an MTP plate with agar medium. A second selection step is done to colony purify the strains, which can be done according to standard procedures as described (EP635574B) or by replating on PDA medium with 60 μg/ml hygB in MTP format.

To induce the Cre recombinase under control of the xylanase promoter, minimal medium agar plates containing 1% xylose and 1% glucose (xylanase inducing medium) were used. Transformants were transferred from PDA plates to xylanase inducing medium, preferably in an MTP plate. Subsequently, the plates were incubated for 6 days at 30° C. Spores were then transferred to a new agar MTP plate containing 1% xylose and 1% glucose. Resulting colonies after re-growth on xylanase inducing medium were tested for their hygromycin B resistance by testing growth on plates with and without hygromycin B (60 μg/ml) as described above.

Most colonies analyzed after purification and growth on xylanase inducing medium have lost their hygromycin B resistance. Individual transformants can be tested for their respective gene disruptions. Targeting frequency obtained through this method is similar to that of the transformations using a single type of fragment such as for nicB or epo as described above. In addition, overall success percentage obtained after fragment cloning, PCR amplification, transformation of fragment and colony purification for the 96 genes is over 90%.

In this example we show that the cre-lox system as used herein is a very efficient system for gene disruption and marker removal after a single transformation. In addition, when using strains deficient in NHEJ, the bipartite gene-targeting approach combined with the cre-lox system results in a highly efficient system for making marker-free strains with defined modifications, which very well can be automated and efficiently used for high throughput gene and genome-wide gene disruption programs, thereby generating marker-free strains which can be used in subsequent transformations. 

The invention claimed is:
 1. A method for carrying out recombination at a target locus to delete an existing sequence at the target locus, which method comprises: providing two or more nucleic acids which, when taken together, comprise: (a) sequences capable of homologous recombination with sequences flanking the target locus; (b) two or more site-specific recombination sites; (c) a sequence encoding a recombinase which recognizes the site-specific recombination sites; and (d) a sequence encoding a marker, wherein the two or more nucleic acids are capable of homologous recombination with each other so as to give rise to a single nucleic acid, and wherein at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of the marker; and recombining the said two or more nucleic acids with each other and with the sequences flanking the target locus so that a contiguous nucleic acid sequence encoding a functional marker, the sequence encoding the recombinase and at least two site-specific recombination sites are inserted at the target locus, said marker-encoding and/or recombinase-encoding sequence being flanked by at least two site-specific recombination sites and the said site-specific recombination sites being flanked by the sequences capable of homologous recombination with sequences flanking the target locus, wherein recombination of the nucleic acids with each other and with sequences flanking the target locus is carried out in vivo in a fungal cell, the method further comprising expressing the recombinase so that the sequence located between the site-specific recombination sites is out-recombined, wherein out-recombination of the nucleic acid sequence between site-specific recombination sites is carried out in vivo, and wherein the site-specific recombination sites are lox sites and the recombinase is Cre; wherein the site-specific recombination sites are FRT sites and the recombinase is Flp; wherein the recombination sites are Vlox sites and the recombinase is VCre; or wherein the recombination sites are Slox and the recombinase is SCre.
 2. A method according to claim 1, wherein the two or more nucleic acids, when taken together, comprise sequences capable of homologous recombination with sequences flanking two or more target loci, so that recombining the said two or more nucleic acids with each other and with the sequences flanking the target loci results in the insertion of at least two site-specific recombination sites at each target locus, wherein recombining the two or more nucleic acids results in: a contiguous sequence encoding a functional marker is present at each target locus; a sequence encoding a functional recombinase is present in at least one target locus; said marker-encoding and/or recombinase-encoding sequence located between at least two site-specific recombination sites; and the said site-specific recombination sites are flanked by the sequences capable of homologous recombination with sequences flanking the target locus.
 3. A method according to claim 1, wherein two of the at least two nucleic acids each comprise a sequence encoding a non-functional portion of the recombinase such that together they comprise nucleic acid sequence encoding a functional recombinase.
 4. A method according to claim 1, wherein the marker or markers is/are out-recombined.
 5. A method according to claim 1, wherein expression of the recombinase is controlled by an inducible promoter.
 6. A method according to claim 2, wherein the two or more nucleic acids, taken together, comprise sequences encoding at least two different markers, wherein, for each marker, at least two of the two or more nucleic acids each comprise a sequence encoding a non-functional portion of the marker, such that recombination of the two or more nucleic acids results in a different marker gene-encoding sequence being inserted at each target locus.
 7. A method according to claim 6, wherein recombination of the two or more nucleic acids results in the said marker-encoding sequences being inserted at each target locus so that they are located between site-specific recombination sites and may be out-recombined from the target loci on expression of the recombinase.
 8. A method according to claim 1, wherein the fungal cell is a yeast cell.
 9. A method according to claim 1, wherein the fungal cell is a filamentous fungal cell.
 10. A method according to claim 1, wherein the method is carried out in a cell which is variant of a parent host cell, the parent host cell having a preference for non-homologous recombination, wherein a ratio of non-homologous recombination/homologous recombination is decreased in the variant as compared to said ratio in said parent host cell measured under the same conditions.
 11. A method according to claim 1, wherein the site-specific recombination sites are such that out-recombination following recombinase expression gives rise to a single mutant site-specific recombination site at the target locus which is not recognized by the recombinase.
 12. A method according to claim 1, wherein the target locus comprises a coding sequence which is disrupted and/or partially or fully deleted.
 13. A method according to claim 1, wherein the method is carried out two or more times in parallel.
 14. A method according to claim 13, wherein each parallel reaction is carried out in a volume of about 250 μl or less.
 15. A method according to claim 8, wherein the fungal cell is a yeast cell selected from the group consisting of S. cerevisiae, Yarrowia lipolytica, and K. lactis.
 16. A method according to claim 9, wherein the filamentous fungal cell is from a species of a genus selected from the group consisting of Aspergillus, Penicillium, Talaromyces, and Trichoderma. 